CN114581991A - Behavior attitude identification method based on dynamic perception of facial expressions - Google Patents

Behavior attitude identification method based on dynamic perception of facial expressions

Info

Publication number
CN114581991A
Authority
CN
China
Prior art keywords
face
displacement
facial
time
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210219980.9A
Other languages
Chinese (zh)
Inventor
余伟
余放
李宇轩
李石君
杨弋
卢可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Hangjun Technology Co ltd
Wuhan University WHU
Original Assignee
Wuhan Hangjun Technology Co ltd
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Hangjun Technology Co ltd, Wuhan University WHU filed Critical Wuhan Hangjun Technology Co ltd
Priority to CN202210219980.9A
Publication of CN114581991A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a behavior attitude identification method based on dynamic perception of facial expressions, comprising the following steps. S1: facial image preprocessing, in which the face is detected and located in the expression image, cropped once its exact position is found so that other interfering information is thoroughly removed, and highlighted through image enhancement. S2: establishing a facial time-series feature evolution model. S3: judging the answering attitude according to the dynamic time-series features of the face. The invention analyzes facial expressions with artificial intelligence technology to recognize behavioral attitude, can effectively understand a user's real inner feelings, and can be widely applied to fields such as marital relationship prediction, communication and negotiation, and teaching evaluation. By analyzing a user's expressions, the user's real intention can be discovered, illegal behavior by dangerous individuals can be stopped in time, and whether a prisoner is lying or has violent tendencies can be well predicted, thereby protecting the long-term security of the country.

Description

Behavior attitude identification method based on facial expression dynamic perception
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a behavior attitude identification method based on dynamic perception of facial expressions.
Background
Facial expression recognition identifies human facial expressions such as surprise, sadness, happiness, and anger, and has a wide range of potential applications. Expression is the intuitive reflection of human emotion, and expression recognition has long been an important research topic in computer vision. Over the past few decades, researchers have achieved significant results on various expression recognition problems. One approach constructs computer vision features, seeking an efficient representation to describe expressions before performing model learning; another searches for a learning algorithm suited to the characteristics of expressions for building the model. In recent years, studies have encoded the six expressions proposed by the Facial Action Coding System (FACS), namely anger, disgust, fear, happiness, sadness, and surprise, and recognized the expression of a human face in a picture or a video. The main existing technologies are as follows:
1. LBP-TOP-based identification method
A 68-point active shape model is used to locate key points of the face. Based on the obtained key points, a locally weighted average algorithm computes the deformation between the face image in the first frame of each sequence and a model face image, and this deformation is then applied to every frame of the corresponding sequence. This eliminates, to some extent, the differences between different faces and different sequences in the expressionless state.
2. Identification method based on STCLQP
The completed local quantization pattern is an improvement on LBP. Unlike LBP, which encodes only the relationship between the gray values of local pixels, the completed local quantization pattern decomposes the local co-occurrence pattern of the central pixel and its surrounding pixels into sign and magnitude components, adds the gradient information of the central pixel, and encodes each component separately as binary numbers. When constructing the statistical histogram, in order to reduce the feature dimension, it does not count all possible binary codes but only the most frequently occurring binary patterns, and introduces vector quantization, which allows the number of centers (the number of words in the codebook) to be specified during quantization and yields a histogram of specified dimension as the feature.
3. LBP-SIP-based identification method
Unlike work that improves on LBP-TOP, the local binary pattern with six intersection points extends the LBP feature from another perspective for micro-expression recognition. LBP-SIP uses four points on the same plane as the central point as the spatial texture description and the central points of the preceding and following frames as the temporal texture description, which reduces the feature dimension and improves the efficiency of feature extraction.
4. Delaunay time domain coding-based identification method
The temporal coding model based on Delaunay triangulation uses an active appearance model to calibrate the face image sequence. Because the variation range of an expression is very small, expression change cannot be well described using key points alone, so the sequence images are normalized by the feature points to obtain a face image sequence with fixed feature point positions. Delaunay triangulation divides the face into a series of triangular regions based on the given feature points. Since the feature points have been normalized, every triangular region has the same size and shape, with the same number of pixels. By comparing how the same region changes over time, the dynamic process of the expression can be described.
However, these methods are susceptible to illumination changes and similar factors and lack robustness in real scenes. Moreover, these feature extraction methods depend mainly on the designers' prior knowledge and require manual parameter tuning, so only a small number of parameters can appear in the feature design, which may greatly reduce the recognition rate.
To solve the problems set forth above, we propose a behavior attitude identification method based on dynamic perception of facial expressions.
Disclosure of Invention
The invention aims to provide a behavior attitude identification method based on dynamic perception of facial expressions, so as to solve the problems raised in the background art. The invention judges the answering attitude of an online respondent from facial displacement features, and thereby indirectly judges the individual's credibility. Online answering is a dynamic process, so facial time-series data collected during the respondent's answering must be converted into a key-point displacement time series. This time series is perturbed through Markov theory to generate a random evolution sequence of facial key-point displacements; the evolved sequence differs only slightly from the original sequence yet forms a random disturbance of the original data over a longer time span. Using the evolution sequence as a contrastive term, a splitting loss function is constructed; this loss function gives the trained model stronger generalization ability. During training, the splitting loss function makes the facial key-point displacement time series of the same attitude as similar as possible and separates the corresponding sequences of different attitudes as much as possible. The encoder trained with the splitting loss function, combined with a logistic function, provides the ability to judge the respondent's answering attitude.
In order to achieve the above object, the invention provides the following technical solution. The behavior attitude identification method based on dynamic perception of facial expressions comprises the following steps:
S1, facial image preprocessing: the face is detected and located in the expression image, cropped once its exact position in the image is found so that other interfering information is thoroughly removed, and highlighted through image enhancement;
S2, establishing a facial time-series feature evolution model;
S3, judging the answering attitude according to the dynamic time-series features of the face.
Preferably, in step S1, the image enhancement stretches the gray-level details of the target object as required by the user through a piecewise linear transformation: the whole gray interval 0-255 is divided into several segments as needed, and a corresponding linear transformation is applied to each segment; taking a three-segment division with breakpoints (a, c) and (b, d) as an example, the linear transformation formula is as follows:
g(x, y) = (c/a)·f(x, y), for 0 ≤ f(x, y) < a;
g(x, y) = ((d - c)/(b - a))·(f(x, y) - a) + c, for a ≤ f(x, y) < b;
g(x, y) = ((255 - d)/(255 - b))·(f(x, y) - b) + d, for b ≤ f(x, y) ≤ 255.
Preferably, in step S1, the facial positioning of the expression image uses a gray-level integral projection method, on the basis of the piecewise linear transformation of the image, to locate the face region, including a horizontal integral projection and a vertical integral projection;
let the size of the gray image be M × N, and let f(x, y) be the gray value at point (x, y) of the image;
its vertical integral projection is then:
V(x) = Σ_{y=1..N} f(x, y)
the horizontal integral projection is:
H(y) = Σ_{x=1..M} f(x, y)
the image is then scale-normalized to a uniform size: it is scaled proportionally, and the boundary is cropped and size-normalized to obtain the facial expression image.
Preferably, in step S2, the establishing a face temporal feature evolution model includes the following:
S20, facial time-series data: the facial time-series data of a respondent is the record of how the pixels in the face region change over time within the respondent's answering time for a unit question amount;
S21, facial time-series features: the facial time-series data is dynamic data and contains the dynamic features of the face;
S22, facial key point coordinates and displacements: a facial key point is a set of key pixels whose relative positions on the features can be calibrated; the key points can be calibrated manually or assigned by machine-learned weights;
S23, facial key point displacement time series: the facial key point displacement time series is the sequence of coordinate displacements of a person's facial key point over time within the answering time of a unit question amount, recorded as a discrete vector; the facial key point displacement time series is defined by the following formula:
x(0) = [Δx(t_1), Δx(t_2), ···, Δx(t_n)]   (4);
wherein x(0) represents the original displacement vector sequence of the same key-point pixel at n moments, and the element Δx(t_i) indicates the displacement direction of the pixel at moment t_i;
S24, random evolution of facial key point displacements: the respondents' attitudes toward the test are divided into positive and negative; the individual deviation is used to apply a disturbance to x(0), so that the evolved displacement sequence differs somewhat from the original sequence and the influence extends over a period of time;
S25, initial distribution of facial key point displacement: the initial distribution of facial key point displacement is defined as follows:
φ(t_0) = [φ_1, φ_2, ···, φ_N]   (5);
wherein the vector φ(t_0) represents the initial distribution of the facial key point displacement, recording the probability that the facial key point is at each mark position at the initial time t_0, and the component φ_i represents the probability that the facial key point is at the i-th mark position;
the evolution matrix of the facial key point displacement is:
S = [S_ij] (an N × N matrix);
wherein S represents the displacement evolution matrix of the facial key point; the element S_ij in row i and column j is a random disturbance probability, representing the probability that the facial key point displacement moves from key position i to key position j over one unit moment, and it can be obtained by counting the frequency with which the facial key point falls on each new mark position when respondents finish answering;
S26, multi-step displacement distribution of facial key points: according to the Chapman-Kolmogorov equation:
φ(t_m) = φ(t_0)·S^m = [φ_1(m), φ_2(m), ···, φ_N(m)];
wherein φ(t_m) is the m-step displacement distribution of the key point, representing the estimated distribution of the respondent's facial key point displacement after m unit moments from the initial moment; the element φ_i(m) represents the probability that the key point is at the i-th position after m unit moments; the multi-step displacement distribution is estimated by the above Markov process, continuous in time and state;
S27, random evolution of facial key point displacement: the random evolution of the facial time-series feature refers to the probability that, at a certain moment, the same facial key point moves toward a key position among the mark positions.
Preferably, in step S22, after the video of the facial time-series data is compressed to a uniform standard aspect ratio, the set of coordinates of the pixels in the same key point is called the facial key point coordinate set:
X_m = {x_m^(1), x_m^(2), ···, x_m^(K)};
wherein X_m represents the facial key point coordinate set, and the element x_m^(k) represents the coordinate of the k-th pixel;
ΔX_m(t) = {Δx_m^(1)(t), Δx_m^(2)(t), ···, Δx_m^(K)(t)};
wherein ΔX_m(t) represents the set of displacement vectors of the key point coordinates at time t; the element Δx_m^(k)(t) indicates the displacement direction of the k-th pixel, with the coordinate at time t+1 as the end point and the coordinate at time t as the start point.
Preferably, in step S27, the random evolution sequence of the facial key point displacement is as follows:
x(m) = [Δx'(t_1), Δx'(t_2), ···, Δx'(t_n)];
wherein the random evolution model of the facial time-series feature is a transformation from the original displacement vector sequence x(0) to the m-step displacement sequence x(m); the transformation applies an N-dimensional displacement probability distribution to each component of x(0), and components are replaced by randomly evolved directions according to that probability.
Preferably, in step S3, the answer attitude determination according to the dynamic time-series feature of the face includes the following contents:
S30, constructing same-attitude sample pairs: the facial key point displacement time series x(0) of a given positive attitude sample is counted according to step S23; the corresponding random evolution sequence x(m) of the facial key point displacement is computed according to step S27 to form a sample pair; corresponding evolution sequences are constructed for all samples with positive attitude labels;
S31, constructing the splitting loss function: the splitting loss function is designed in a form that distinguishes sample pairs from non-sample pairs, so that face displacement sequences of the same attitude are gathered together as much as possible and sequences from different attitudes are separated;
S32, attitude judgment: the minimum points of the splitting loss function are obtained, and a new facial time series is input into the trained R[·] + logistic linear classifier structure.
Preferably, in step S31, the splitting loss function is defined as follows:
L(x_i) = -log( exp(R[x_i(0)]·R[x_i(m)]) / ( exp(R[x_i(0)]·R[x_i(m)]) + Σ_j exp(R[x_i(0)]·R[x_j(m)]) ) );
wherein L(x_i) represents the splitting loss function, the summation over j runs over the evolution sequences of samples with the opposite attitude, and R[·] is a backbone neural network with parameters, for which various existing specific network forms can be selected as required.
Compared with the prior art, the invention has the beneficial effects that:
the invention analyzes facial expressions through an artificial intelligence technology, realizes the recognition of behavior attitude, can effectively understand the real inner feeling of a user, can be widely applied to businesses such as marital relation prediction, communication negotiation, teaching evaluation and the like, can find the real intention of the user through analyzing the expressions of the user, can timely stop illegal behaviors of dangerous molecules, and can well predict whether a prisoner lies, whether violent behaviors exist and the like, thereby protecting the long-term security of the country.
Drawings
FIG. 1 is the overall flow chart of the behavior attitude identification method based on dynamic perception of facial expressions according to the present invention;
FIG. 2 is a diagram of the random evolution of facial key point displacements in the behavior attitude identification method based on dynamic perception of facial expressions.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-2, the present invention provides a technical solution: the behavior attitude identification method based on the dynamic perception of the facial expressions comprises the following steps:
First, facial image preprocessing.
1. Image enhancement
The main purpose of image enhancement is to highlight the face in the image. A piecewise linear transformation stretches the gray-level details of the target object as the user requires: the whole gray interval 0-255 is divided into several segments as needed, and each segment is given its own linear transformation. Taking a three-segment division with breakpoints (a, c) and (b, d) as an example, the transformation formula is as follows:
g(x, y) = (c/a)·f(x, y), for 0 ≤ f(x, y) < a;
g(x, y) = ((d - c)/(b - a))·(f(x, y) - a) + c, for a ≤ f(x, y) < b;
g(x, y) = ((255 - d)/(255 - b))·(f(x, y) - b) + d, for b ≤ f(x, y) ≤ 255.
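By way of illustration, a minimal NumPy sketch of such a three-segment gray-level stretch follows; the breakpoint values (a, c) and (b, d) are assumed defaults for demonstration, not parameters fixed by the method.

```python
import numpy as np

def piecewise_linear_stretch(gray, a=80, c=30, b=170, d=220):
    """Three-segment piecewise linear gray-level stretch.

    gray: 2-D uint8 array with values in [0, 255].
    (a, c) and (b, d): assumed breakpoints; gray levels in [a, b) are mapped
    to [c, d), stretching the mid-range details of the target object.
    """
    f = gray.astype(np.float64)
    g = np.empty_like(f)

    low = f < a
    mid = (f >= a) & (f < b)
    high = f >= b

    g[low] = (c / a) * f[low]
    g[mid] = (d - c) / (b - a) * (f[mid] - a) + c
    g[high] = (255 - d) / (255 - b) * (f[high] - b) + d

    return np.clip(g, 0, 255).astype(np.uint8)
```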
2. Facial positioning of expression images
An important step in expression image preprocessing is face detection and localization, which finds the exact position of the face in the image; the face is then cropped, other interfering information is thoroughly removed, and only the face information is kept for subsequent analysis. On the basis of the piecewise linear transformation of the image, a gray-level integral projection method is used to locate the face region, including a horizontal integral projection and a vertical integral projection.
Let the size of the gray image be M × N, and let f(x, y) be the gray value at point (x, y) of the image.
Its vertical integral projection is then:
V(x) = Σ_{y=1..N} f(x, y)
The horizontal integral projection is:
H(y) = Σ_{x=1..M} f(x, y)
To obtain the complete facial expression information, the face boundary obtained by gray-level integral projection is slightly adjusted, i.e., the upper and lower boundaries are appropriately expanded. Because the face images cropped by gray-level integral projection differ in size, the images are scale-normalized to a uniform size to facilitate subsequent work such as feature extraction. The image is scaled proportionally, and the boundary is then cropped and size-normalized to obtain the facial expression image.
Secondly, establishing a face time sequence feature evolution model
1. Face time series data
The facial time-series data of a respondent is the record of how the pixels in the face region change over time within the respondent's answering time for a unit question amount. The unit question amount is an answer quantity that contains at least one feedback operation by the respondent and is meaningful to measure; depending on the specific task it can be set to half a question or a single question, or, for questions that take longer, to a specified period of time. A single question can generally be chosen as the unit question amount. The time starting point of the unit amount is the moment the question appears on the subject's display, and the end point is the moment the subject performs an effective feedback operation; the difference between the two is the answering time for the unit amount. The time-series data is recorded and stored in video format.
2. Facial timing features
Existing face recognition techniques extract 2D static features of the face. 3D face recognition collects data through structured light or optical depth techniques to build static 3D face data and extract 3D static features. Facial time-series data, by contrast, is dynamic data that contains the dynamic features of the face; in this work a generalized facial key point displacement time series is established to represent these dynamic facial features.
3. Face key point coordinates and displacements
A facial key point is a set of key pixels whose relative positions on the facial features can be calibrated; the key points can be calibrated manually or assigned by machine-learned weights. After the video of the facial time-series data is compressed to a uniform standard aspect ratio, the set of coordinates of the pixels in the same key point is called the facial key point coordinate set.
Definition: facial key point coordinate set
X_m = {x_m^(1), x_m^(2), ···, x_m^(K)};
wherein X_m represents the facial key point coordinate set and the element x_m^(k) represents the coordinate of the k-th pixel. Facial time-series features are dynamic, so the focus is on the dynamic displacement changes of the pixels within the set.
Defining: displaced set of facial keypoints
Figure BDA0003536793980000103
Wherein, Δ Xm(t) a set of displacement vectors representing the coordinates of the key points at time t; element(s)
Figure BDA0003536793980000104
Indicating the displacement direction of the kth pixel, with the coordinate at time t +1 as the end point and the coordinate at time t as the start point.
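A small sketch of computing the displacement sets ΔX_m(t) from tracked key-point pixel coordinates; the (frames × pixels × 2) array layout is an assumed representation.

```python
import numpy as np

def displacement_sets(coords):
    """coords: array of shape (T, K, 2) holding the (x, y) coordinates of the
    K pixels of one facial key point over T frames (assumed layout).
    Returns an array of shape (T-1, K, 2): element [t, k] is the displacement
    vector of the k-th pixel from frame t (start point) to frame t+1 (end
    point), i.e. the set ΔX_m(t) for each t."""
    coords = np.asarray(coords, dtype=np.float64)
    return coords[1:] - coords[:-1]
```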
4. Facial key point displacement time series
The facial key point displacement time series is the sequence of coordinate displacements of a person's facial key points over time within the answering time of a unit question amount, recorded as a discrete vector (array).
Definition: facial key point displacement time series
x(0) = [Δx(t_1), Δx(t_2), ···, Δx(t_n)]   (4);
wherein x(0) represents the original displacement vector sequence of the same key-point pixel at n moments, and the element Δx(t_i) indicates the displacement direction of the pixel at moment t_i. The facial key point displacement time series records no coordinates, because each of its elements is a displacement vector; as long as the initial coordinate of the key point is determined, the coordinate at each moment can be obtained by summing the displacement vectors. This sequence characterizes the dynamic features of the respondent's facial key points during answering.
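Continuing the same assumed layout, the displacement time series x(0) for one key point can be assembled as below; reducing the key point's K pixel displacements to one vector per moment by averaging is an assumption for illustration.

```python
import numpy as np

def displacement_time_series(coords):
    """Assemble x(0) = [Δx(t_1), ..., Δx(t_n)] for one facial key point.
    coords: (T, K, 2) pixel coordinates of the key point over T frames.
    Returns an (n, 2) array with n = T - 1; row i is the key point's
    displacement at moment t_i, taken here as the mean displacement of its
    K pixels (an assumed reduction)."""
    coords = np.asarray(coords, dtype=np.float64)
    per_pixel = coords[1:] - coords[:-1]   # (T-1, K, 2) pixel displacements
    return per_pixel.mean(axis=1)          # one displacement vector per moment
```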
5. Random evolution of facial keypoint displacements
Respondents' attitudes toward the test are divided into positive cooperation and negative resistance, hereinafter referred to simply as positive and negative. This work assumes that the two attitudes produce a certain deviation on each individual face: the facial time-series features of respondents with the same attitude are not exactly the same, but on the whole they share a certain commonality, so the differences between individuals with the same attitude are regarded as random evolutionary deviations from an ideal state. The invention uses this deviation to apply a disturbance to x(0), so that the evolved displacement sequence differs somewhat from the original sequence and the influence extends over a period of time. This random disturbance allows the model to gain generalization ability in the subsequent training phase.
6. Initial distribution of facial key point displacement
Definition: initial distribution of facial key point displacement
φ(t_0) = [φ_1, φ_2, ···, φ_N]   (5);
wherein the vector φ(t_0) represents the initial distribution of the facial key point displacement, recording the probability that the facial key point is at each mark position at the initial time t_0; the component φ_i represents the probability that the facial key point is at the i-th mark position.
Definition: evolution matrix of facial key point displacement
S = [S_ij] (an N × N matrix);
wherein S represents the displacement evolution matrix of the facial key point; the element S_ij in row i and column j is a random disturbance probability, representing the probability that the facial key point displacement moves from key position i to key position j over one unit moment, and it can be obtained by counting the frequency with which the facial key point falls on each new mark position when respondents finish answering.
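A sketch of estimating the evolution matrix S by frequency counting over observed mark-position transitions, as described above; the small smoothing constant is an added assumption to keep every row a valid distribution.

```python
import numpy as np

def estimate_evolution_matrix(position_sequences, n_positions, alpha=1e-3):
    """Estimate the evolution matrix S from observed mark-position indices.
    position_sequences: iterable of integer sequences; each lists the mark
    position (0..N-1) occupied by a facial key point at successive unit
    moments for one finished answer.
    Returns an N x N row-stochastic matrix with S[i, j] ~ P(j at t+1 | i at t).
    alpha is an assumed smoothing constant for unobserved transitions."""
    counts = np.full((n_positions, n_positions), alpha, dtype=np.float64)
    for seq in position_sequences:
        for i, j in zip(seq[:-1], seq[1:]):
            counts[i, j] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)
```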
7. Multiple displacement distribution of facial key points
According to the Chapman-Kolmogorov equation:
φ(t_m) = φ(t_0)·S^m = [φ_1(m), φ_2(m), ···, φ_N(m)];
wherein φ(t_m) is the m-step displacement distribution of the key point, representing the estimated distribution of the respondent's facial key point displacement after m unit moments from the initial moment; the element φ_i(m) represents the probability that the key point is at the i-th position after m unit moments. The multi-step displacement distribution is estimated by the above Markov process, continuous in time and state.
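Given φ(t_0) and S, the m-step distribution follows directly from the Chapman-Kolmogorov relation; a minimal sketch:

```python
import numpy as np

def multi_step_distribution(phi0, S, m):
    """Chapman-Kolmogorov estimate phi(t_m) = phi(t_0) · S^m.
    phi0: length-N initial distribution over mark positions.
    S: (N, N) evolution matrix.  Returns the length-N distribution of the
    key point's displacement after m unit moments."""
    S = np.asarray(S, dtype=np.float64)
    return np.asarray(phi0, dtype=np.float64) @ np.linalg.matrix_power(S, m)
```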
8. Random evolution of displacement of facial keypoints
The random evolution of the facial time-series feature refers to the probability that, at a certain moment, the same facial key point moves toward a key position among the mark positions; this probability changes over time.
Definition: random evolution sequence of facial key point displacement
x(m) = [Δx'(t_1), Δx'(t_2), ···, Δx'(t_n)];
wherein the random evolution model of the facial time-series feature is a transformation from the original displacement vector sequence x(0) to the m-step displacement sequence x(m); the transformation applies an N-dimensional displacement probability distribution to each component of x(0), and components are replaced by randomly evolved directions according to that probability.
Specifically, the probability that any displacement component is replaced follows the distribution φ(t_m), and the replacement displacement points toward the mark position indicated by the sampled component of φ(t_m). The magnitude of the displacement is proportional to the distance from the current coordinate to the center of that mark position. Since respondents' facial movements share a certain overall commonality in the calculated φ(t_m), replacing a given displacement in x(0) with a position from φ(t_m) is a small-probability event; overall, x(m) is similar to the sequence x(0) after m displacements, except that replacements occur in individual components. Therefore x(m) is called the random evolution sequence of the facial time-series feature after m displacements.
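A sketch of the random evolution described above: each component of x(0) is kept with high probability and otherwise replaced by a displacement pointing toward a mark position sampled from φ(t_m); the replacement probability and the mark-center representation are assumptions for illustration.

```python
import numpy as np

def evolve_sequence(x0, phi_m, mark_centers, coords, replace_prob=0.05, rng=None):
    """Randomly evolve a displacement time series x(0) into x(m).
    x0: (n, 2) displacement vectors of one key point.
    phi_m: length-N distribution phi(t_m) over mark positions (sums to 1).
    mark_centers: (N, 2) center coordinates of the mark positions.
    coords: (n, 2) key-point coordinates at each moment (start points).
    replace_prob: assumed small probability that a component is replaced.
    Returns x(m): identical to x(0) except that a few components are replaced
    by displacements pointing toward a mark position sampled from phi(t_m)."""
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.asarray(x0, dtype=np.float64)
    coords = np.asarray(coords, dtype=np.float64)
    mark_centers = np.asarray(mark_centers, dtype=np.float64)
    xm = x0.copy()
    for i in range(len(x0)):
        if rng.random() < replace_prob:
            j = rng.choice(len(phi_m), p=phi_m)   # sample a mark position
            xm[i] = mark_centers[j] - coords[i]   # displacement toward its center
    return xm
```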
Thirdly, answer attitude judgment according to the dynamic time sequence characteristics of the face
1. Constructing pairs of same-attitude samples
The first step in answer attitude judgment is the evolution of the facial time-series data. Positive attitude samples are taken as the example below; negative attitude samples are handled in the same way. For a given positive attitude sample, the facial key point displacement time series x(0) is counted as described above, and the corresponding random evolution sequence x(m) of the facial key point displacement is computed, forming a sample pair; corresponding evolution sequences are constructed for all samples with positive attitude labels.
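A small sketch of building the same-attitude sample pairs; the evolve callable stands for a random-evolution routine such as the evolve_sequence sketch above, already bound to φ(t_m) and the mark-position centers.

```python
def build_same_attitude_pairs(positive_samples, evolve):
    """positive_samples: list of (x0, coords) tuples for positive-attitude
    respondents; evolve: a callable mapping (x0, coords) to an evolved
    sequence x(m). Returns a list of (x(0), x(m)) same-attitude pairs."""
    return [(x0, evolve(x0, coords)) for x0, coords in positive_samples]

# Example binding with the hypothetical helpers above:
# pairs = build_same_attitude_pairs(
#     samples, lambda x0, c: evolve_sequence(x0, phi_m, mark_centers, c))
```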
2. Constructing a splitting loss function
The splitting loss function is designed in a form that distinguishes sample pairs from non-sample pairs: it gathers face displacement sequences of the same attitude together as much as possible and separates sequences from different attitudes.
Defining: splitting loss function
Figure BDA0003536793980000131
Wherein, L (x)i) Representing a cleavage loss function, R [, ]]The method is a backbone neural network with parameters, and can select various existing specific network forms (such as ResNet, VGG-19, and when the number of answers is large or the answer time is long, a serialization network (such as GRU, LSTM, BERT) can be selected as required to be used as a dot product of a subsection in an encoder logarithmic function of a face displacement time sequence and a corresponding evolution sequence, and to be used as a vector dot product of a sample sequence and the corresponding evolution sequence after the sample sequence and the corresponding evolution sequence are output by an encoder. When the fragmentation function is more optimized, R2]The middle parameter will make the point product value of the homomorphism sample larger, i.e. the similarity is higher. The point multiplication term of the accumulation part of the denominator is the point multiplication of the original sequence and the sample evolution sequence with different attitudes, so that when the splitting loss is reduced, R [ 2 ]]The medium parameters can reduce the dot product value of the part as much as possible, reduce the similar pairs of the partial dot product values and split the displacement sequences with different attitudes. Therefore, the optimization process of the splitting function enables the encoder R to extract the common mode of the displacement of the key points of the faces of the respondents with the same attitude and separate the non-common modes of the faces of the respondents with different attitudes. These patterns are dynamic time series at the input and are therefore a dynamic pattern.
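Under the reconstruction above, an InfoNCE-style sketch of the splitting loss for one anchor sample is given below in NumPy; it computes the loss value only (training would require an autograd framework), and the encoder R is any callable mapping a displacement sequence to an embedding vector, its architecture being left open as in the text.

```python
import numpy as np

def splitting_loss(encoder, x0_i, xm_i, xm_other_attitude):
    """Splitting loss for one anchor sample x_i (value only).
    encoder: callable R mapping a displacement sequence to a 1-D embedding.
    x0_i: original sequence of sample i; xm_i: its evolved sequence (the
    same-attitude pair); xm_other_attitude: evolved sequences of samples
    with the opposite attitude (the denominator terms)."""
    z = encoder(x0_i)
    pos = np.exp(z @ encoder(xm_i))                               # same-attitude pair
    neg = sum(np.exp(z @ encoder(xm_j)) for xm_j in xm_other_attitude)
    return float(-np.log(pos / (pos + neg)))
```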
3. Attitude determination
The main part of the splitting loss function is a factor similar to the softmax function, so its minimum points can be obtained by various existing optimization methods (e.g. ADAM, SGD). The trained encoder R becomes the input stage of the discriminator, and its output is connected to a two-class logistic linear classifier, so that new facial key point time-series data can be judged. Inputting a new facial time series into the trained R[·] + logistic linear classifier structure yields the attitude judgment result. It can be seen that the whole training process (the optimization of the splitting loss function) requires only a small amount of labeled data; most of the data is evolved from the original data. The essence of the evolution is to add to the original time series a random disturbance that does not cause the whole to deviate greatly, and the disturbance acts over a long stretch of the time series, so that even though the time series represents a long-term dynamic displacement pattern, the model still gains generalization ability during training.
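A sketch of the discriminator stage: embeddings from the trained encoder R feed a two-class logistic classifier; scikit-learn's LogisticRegression is used here as a stand-in for the logistic layer.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_attitude_classifier(encoder, sequences, labels):
    """sequences: list of facial key-point displacement time series;
    labels: 1 for positive attitude, 0 for negative.
    Fits a two-class logistic classifier on the trained encoder's embeddings."""
    feats = np.stack([encoder(x) for x in sequences])
    return LogisticRegression().fit(feats, labels)

def judge_attitude(encoder, clf, new_sequence):
    """Apply the trained R[·] + logistic structure to a new face time series."""
    return int(clf.predict(encoder(new_sequence)[None, :])[0])
```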
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that various changes in the embodiments and/or modifications of the invention can be made, and equivalents and modifications of some features of the invention can be made without departing from the spirit and scope of the invention.

Claims (8)

1. The behavior attitude identification method based on the dynamic perception of the facial expressions is characterized by comprising the following steps of:
S1, facial image preprocessing: the face is detected and located in the expression image, cropped once its exact position in the image is found so that other interfering information is thoroughly removed, and highlighted through image enhancement;
S2, establishing a facial time-series feature evolution model;
S3, judging the answering attitude according to the dynamic time-series features of the face.
2. A behavioral attitude recognition method based on dynamic perception of facial expressions according to claim 1, characterized in that in step S1, the image enhancement stretches the gray-level details of the target object as required by the user through a piecewise linear transformation: the whole gray interval 0-255 is divided into several segments as needed, and a corresponding linear transformation is applied to each segment; taking a three-segment division with breakpoints (a, c) and (b, d) as an example, the linear transformation formula is as follows:
g(x, y) = (c/a)·f(x, y), for 0 ≤ f(x, y) < a;
g(x, y) = ((d - c)/(b - a))·(f(x, y) - a) + c, for a ≤ f(x, y) < b;
g(x, y) = ((255 - d)/(255 - b))·(f(x, y) - b) + d, for b ≤ f(x, y) ≤ 255.
3. A behavioral attitude recognition method based on dynamic perception of facial expressions according to claim 1, characterized in that in step S1, the facial positioning of the expression image uses a gray-level integral projection method, on the basis of the piecewise linear transformation of the image, to locate the face region, including a horizontal integral projection and a vertical integral projection;
let the size of the gray image be M × N, and let f(x, y) be the gray value at point (x, y) of the image;
its vertical integral projection is then:
V(x) = Σ_{y=1..N} f(x, y)
the horizontal integral projection is:
H(y) = Σ_{x=1..M} f(x, y)
the image is scale-normalized to a uniform size: it is scaled proportionally, and the boundary is cropped and size-normalized to obtain the facial expression image.
4. A behavioral attitude recognition method based on dynamic perception of facial expressions according to claim 1, characterized in that in step S2, the establishing of the evolution model of facial temporal features includes the following steps:
S20, facial time-series data: the facial time-series data of a respondent is the record of how the pixels in the face region change over time within the respondent's answering time for a unit question amount;
S21, facial time-series features: the facial time-series data is dynamic data and contains the dynamic features of the face;
S22, facial key point coordinates and displacements: a facial key point is a set of key pixels whose relative positions on the features can be calibrated; the key points can be calibrated manually or assigned by machine-learned weights;
S23, facial key point displacement time series: the facial key point displacement time series is the sequence of coordinate displacements of a person's facial key point over time within the answering time of a unit question amount, recorded as a discrete vector; the facial key point displacement time series is defined by the following formula:
x(0) = [Δx(t_1), Δx(t_2), ···, Δx(t_n)]   (4);
wherein x(0) represents the original displacement vector sequence of the same key-point pixel at n moments, and the element Δx(t_i) indicates the displacement direction of the pixel at moment t_i;
S24, random evolution of facial key point displacements: the respondents' attitudes toward the test are divided into positive and negative; the individual deviation is used to apply a disturbance to x(0), so that the evolved displacement sequence differs somewhat from the original sequence and the influence extends over a period of time;
S25, initial distribution of facial key point displacement: the initial distribution of facial key point displacement is defined as follows:
φ(t_0) = [φ_1, φ_2, ···, φ_N]   (5);
wherein the vector φ(t_0) represents the initial distribution of the facial key point displacement, recording the probability that the facial key point is at each mark position at the initial time t_0, and the component φ_i represents the probability that the facial key point is at the i-th mark position;
the evolution matrix of the facial key point displacement is:
S = [S_ij] (an N × N matrix);
wherein S represents the displacement evolution matrix of the facial key point; the element S_ij in row i and column j is a random disturbance probability, representing the probability that the facial key point displacement moves from key position i to key position j over one unit moment, and it can be obtained by counting the frequency with which the facial key point falls on each new mark position when respondents finish answering;
S26, multi-step displacement distribution of facial key points: according to the Chapman-Kolmogorov equation:
φ(t_m) = φ(t_0)·S^m = [φ_1(m), φ_2(m), ···, φ_N(m)];
wherein φ(t_m) is the m-step displacement distribution of the key point, representing the estimated distribution of the respondent's facial key point displacement after m unit moments from the initial moment; the element φ_i(m) represents the probability that the key point is at the i-th position after m unit moments; the multi-step displacement distribution is estimated by the above Markov process, continuous in time and state;
S27, random evolution of facial key point displacement: the random evolution of the facial time-series feature refers to the probability that, at a certain moment, the same facial key point moves toward a key position among the mark positions.
5. A behavioral attitude recognition method based on dynamic perception of facial expressions according to claim 4, characterized in that: in step S22, after compressing the video of the face time series data to a uniform standard aspect ratio, the coordinates of each pixel in the same key point are called a face key point coordinate set;
X_m = {x_m^(1), x_m^(2), ···, x_m^(K)};
wherein X_m represents the facial key point coordinate set, and the element x_m^(k) represents the coordinate of the k-th pixel;
ΔX_m(t) = {Δx_m^(1)(t), Δx_m^(2)(t), ···, Δx_m^(K)(t)};
wherein ΔX_m(t) represents the set of displacement vectors of the key point coordinates at time t; the element Δx_m^(k)(t) indicates the displacement direction of the k-th pixel, with the coordinate at time t+1 as the end point and the coordinate at time t as the start point.
6. A behavioral attitude recognition method based on dynamic perception of facial expressions according to claim 4, characterized in that in step S27, the random evolution sequence of the displacements of the facial key points is as follows:
x(m) = [Δx'(t_1), Δx'(t_2), ···, Δx'(t_n)];
wherein the random evolution model of the facial time-series feature is a transformation from the original displacement vector sequence x(0) to the m-step displacement sequence x(m); the transformation applies an N-dimensional displacement probability distribution to each component of x(0), and components are replaced by randomly evolved directions according to that probability.
7. A behavioral attitude recognition method based on dynamic perception of facial expressions according to claim 6, wherein in step S3, said answer attitude determination based on time-series characteristics of facial dynamics includes the following:
S30, constructing same-attitude sample pairs: the facial key point displacement time series x(0) of a given positive attitude sample is counted according to step S23; the corresponding random evolution sequence x(m) of the facial key point displacement is computed according to step S27 to form a sample pair; corresponding evolution sequences are constructed for all samples with positive attitude labels;
S31, constructing the splitting loss function: the splitting loss function is designed in a form that distinguishes sample pairs from non-sample pairs, so that face displacement sequences of the same attitude are gathered together as much as possible and sequences from different attitudes are separated;
S32, attitude judgment: the minimum points of the splitting loss function are obtained, and a new facial time series is input into the trained R[·] + logistic linear classifier structure.
8. A behavioral attitude recognition method based on dynamic perception of facial expressions according to claim 7, characterized in that in step S31, the splitting loss function is defined as follows:
L(x_i) = -log( exp(R[x_i(0)]·R[x_i(m)]) / ( exp(R[x_i(0)]·R[x_i(m)]) + Σ_j exp(R[x_i(0)]·R[x_j(m)]) ) );
wherein L(x_i) represents the splitting loss function, the summation over j runs over the evolution sequences of samples with the opposite attitude, and R[·] is a backbone neural network with parameters, for which various existing specific network forms can be selected as required.
CN202210219980.9A 2022-03-08 2022-03-08 Behavior attitude identification method based on dynamic perception of facial expressions Pending CN114581991A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210219980.9A CN114581991A (en) 2022-03-08 2022-03-08 Behavior attitude identification method based on dynamic perception of facial expressions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210219980.9A CN114581991A (en) 2022-03-08 2022-03-08 Behavior attitude identification method based on dynamic perception of facial expressions

Publications (1)

Publication Number Publication Date
CN114581991A true CN114581991A (en) 2022-06-03

Family

ID=81773718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210219980.9A Pending CN114581991A (en) 2022-03-08 2022-03-08 Behavior attitude identification method based on dynamic perception of facial expressions

Country Status (1)

Country Link
CN (1) CN114581991A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117373100A (en) * 2023-12-08 2024-01-09 成都乐超人科技有限公司 Face recognition method and system based on differential quantization local binary pattern
CN117373100B (en) * 2023-12-08 2024-02-23 成都乐超人科技有限公司 Face recognition method and system based on differential quantization local binary pattern

Similar Documents

Publication Publication Date Title
KR100969298B1 (en) Method For Social Network Analysis Based On Face Recognition In An Image or Image Sequences
CN112560810B (en) Micro-expression recognition method based on multi-scale space-time characteristic neural network
CN111209962B (en) Combined image classification method based on CNN (CNN) feature extraction network and combined heat map feature regression
Caldara et al. Simulating the ‘other-race’effect with autoassociative neural networks: further evidence in favor of the face-space model
CN112464808A (en) Rope skipping posture and number identification method based on computer vision
CN116311483B (en) Micro-expression recognition method based on local facial area reconstruction and memory contrast learning
CN113435335B (en) Microscopic expression recognition method and device, electronic equipment and storage medium
Wang et al. SmsNet: A new deep convolutional neural network model for adversarial example detection
CN116110089A (en) Facial expression recognition method based on depth self-adaptive metric learning
CN114973383A (en) Micro-expression recognition method and device, electronic equipment and storage medium
Singh et al. Age, gender prediction and emotion recognition using convolutional neural network
CN111968124A (en) Shoulder musculoskeletal ultrasonic structure segmentation method based on semi-supervised semantic segmentation
CN114581991A (en) Behavior attitude identification method based on dynamic perception of facial expressions
CN112380374B (en) Zero sample image classification method based on semantic expansion
Abdallah et al. Facial-expression recognition based on a low-dimensional temporal feature space
Sun et al. Using backpropagation neural network for face recognition with 2D+ 3D hybrid information
Verma et al. Hmm-based convolutional lstm for visual scanpath prediction
Diana et al. Cognitive-affective emotion classification: Comparing features extraction algorithm classified by multi-class support vector machine
CN112465054B (en) FCN-based multivariate time series data classification method
CN113255666A (en) Personalized question answering system and method based on computer vision
CN113591607A (en) Station intelligent epidemic prevention and control system and method
Liao et al. Video Face Detection Technology and Its Application in Health Information Management System
CN114724217B (en) SNN-based edge feature extraction and facial expression recognition method
CN112070023B (en) Neighborhood prior embedded type collaborative representation mode identification method
Momin et al. Recognizing facial expressions in the wild using multi-architectural representations based ensemble learning with distillation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination