CN114639138A - Neonatal pain expression recognition method based on a generative adversarial network

Publication number
CN114639138A
Authority
CN
China
Prior art keywords
pain
image
generator
face
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210147904.1A
Other languages
Chinese (zh)
Inventor
潘赟
赵益晟
朱怀宇
陈朔晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202210147904.1A
Publication of CN114639138A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Abstract

A neonatal pain expression recognition method based on a generative adversarial network (GAN). A GAN is constructed to learn how to recover an unoccluded, correctly posed face image from neonatal face images with varying poses and occlusion; the latent variables of the generator in the GAN are treated as corrected facial pain features; and a residual network combined with an attention mechanism is constructed to screen and analyze the facial pain features corrected through the zero-sum game, so as to further remove the influence of occlusion and pose change and output an accurate pain level result. The method aims to improve neonatal pain expression recognition accuracy in real environments and to make neonatal pain recognition more robust to occlusion and more adaptable to pose change. It optimizes pain feature extraction with the GAN and screens the pain features with the attention mechanism, thereby effectively addressing neonatal pain expression recognition under occlusion and pose change.

Description

Neonatal pain expression recognition method based on a generative adversarial network
Technical Field
The invention relates to the field of neonatal pain recognition, in particular to a neonatal pain recognition method based on facial expressions.
Background
Neonatal pain not only causes physiological reactions but also leads to a series of short-term or long-term adverse effects, such as growth retardation, permanent central nervous system injury, and mood disorders, and may even increase the risk of future disease. Developing an automated neonatal pain recognition algorithm for continuous, objective pain assessment is therefore of great importance to neonatal pain management and healthy growth. Automatic recognition of neonatal pain has been studied in, for example, the Chinese patent applications "A method for identifying neonatal pain expression based on a two-channel three-dimensional convolutional neural network" (patent application No. CN201810145292.6, publication No. CN108363979A), "A method and system for identifying neonatal pain expression based on a deep 3D residual network" (patent application No. CN201810346075.3, publication No. CN108596069A), and "A method for identifying neonatal pain expression based on a dual-channel convolutional neural network" (patent application No. CN201910748936.5, publication No. CN111401117A). However, existing studies, including the above patent applications, only consider neonatal pain expression recognition in an ideal (controlled) environment; using deep learning, they have made breakthrough progress on unoccluded, correctly posed neonatal face images. In a real (uncontrolled) environment, occlusion and variable head poses pose great challenges to neonatal pain expression recognition. A key problem to be solved is therefore how to improve recognition accuracy in real environments and how to make neonatal pain recognition robust to occlusion and adaptable to pose change.
Disclosure of Invention
To overcome the shortcomings of the prior art, and considering that no current neonatal pain expression recognition method is robust to facial occlusion and pose change, the invention provides a neonatal pain expression recognition method that is robust to facial occlusion and pose change in real scenes.
The object of the invention is achieved by the following technical solution:
A neonatal pain expression recognition method based on a generative adversarial network, the method comprising the following steps:
Step S1: use the generative adversarial network to restore an unoccluded, correctly posed face image from neonatal face images that have varying poses and may be occluded;
Step S2: take the latent vectors of the generator in the generative adversarial network as corrected facial pain features for subsequent pain analysis;
Step S3: screen and analyze the corrected facial pain features with a residual network combined with an attention mechanism, so as to further remove the interference of occlusion and pose change and output an accurate pain level result.
Further, step S1 proceeds as follows:
The generative adversarial network consists of a generator and a discriminator. The generator is responsible for generating a corrected face image from an input face image; the discriminator is responsible for learning to distinguish images generated by the generator from the ideal face images (unoccluded and correctly posed) in the guide set, where the guide set g consists of all ideal face images in the training set. Through the zero-sum game between the generator and the discriminator, the generator continuously improves its ability to convert input images into unoccluded, correctly posed face images. During the zero-sum game, the generator and the discriminator are trained with four loss functions, and their parameters are optimized by the error backpropagation algorithm. The four loss functions are as follows:
(1) Symmetry loss function
Symmetry is an inherent feature of a normal face, and the symmetry loss is calculated as:
Figure BDA0003509115890000021
where H and W denote the height and width of the image, (n, m) denotes a pixel of the image, and |·| denotes the absolute value. Real-world images may not be exactly symmetric at the pixel level, so the symmetry loss is minimized in Laplacian space;
(2) Adversarial loss function
The discriminator network acts as a supervisor: it is responsible for distinguishing generated face images from ideal images and is trained simultaneously with the generator. The discriminator is trained with the following cross-entropy loss function:
$L_{GAN\text{-}Dis}(g_i, x'_j) = -\log(Dis(g_i)) - \log(1 - Dis(x'_j))$
where $g_i$ denotes a guide-set image, GAN-Dis denotes the discriminator, and $x'_j$ is an image generated by the generator;
For the generator, the adversarial loss function is computed as:
$L_{GAN\text{-}Gen}(x'_j) = -\log(Dis(x'_j))$
where GAN-Gen denotes the generator and Dis denotes the discriminator;
(3) Identity-preserving loss function
Preserving identity is a key part of ideal face generation. A perceptual loss, which maintains perceptual similarity, is adopted to give the pain feature correction module identity-preserving capability. The loss function is computed from the feature maps output by the last two layers of the open-source Light CNN:
Figure BDA0003509115890000031
where $H_l$ and $W_l$ are the height and width of the feature map of the l-th-from-last layer, Ω denotes the feature map, and |·| denotes the absolute value. The identity-preserving loss aims to keep the generated image close to the original image in the deep feature space. Because the pre-trained Light CNN can classify thousands of identities, it is considered able to capture the facial structures or features most important for identity recognition;
(4) Total variation regularization
To improve the spatial smoothness of the generated image and reduce spike artifacts, a total variation regularizer is employed, defined as follows:
Figure BDA0003509115890000032
where H and W represent the height and width of the image, (n, m) represent the pixels of the image, x' is the image generated by the generator, | · | represents an absolute value.
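The two adversarial losses in item (2) above can be illustrated numerically. The following is a minimal sketch and not the patent's implementation: the function names `discriminator_loss` and `generator_loss` and the small `EPS` clamp are assumptions added for illustration, and the discriminator outputs are passed in as plain probabilities.

```python
import math

EPS = 1e-12  # assumed clamp to avoid log(0); not part of the source formulas


def discriminator_loss(dis_real: float, dis_fake: float) -> float:
    """-log(Dis(g_i)) - log(1 - Dis(x'_j)) for one guide image and one generated image."""
    return -math.log(max(dis_real, EPS)) - math.log(max(1.0 - dis_fake, EPS))


def generator_loss(dis_fake: float) -> float:
    """-log(Dis(x'_j)) for one generated image."""
    return -math.log(max(dis_fake, EPS))


if __name__ == "__main__":
    # When the discriminator is confident and correct, its own loss is low
    # while the generator's loss is high.
    print(discriminator_loss(dis_real=0.9, dis_fake=0.1))
    print(generator_loss(dis_fake=0.1))
```

The two functions pull in opposite directions on `dis_fake`, which is the zero-sum game described in step S1.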
Step S3 proceeds as follows:
The facial pain features corrected through the zero-sum game are screened and analyzed with a residual network combined with an attention mechanism. The network uses a conventional residual structure as its backbone, and its core is the attention mechanism combined with that backbone. An attention branch is built in parallel with the residual branch; based on a bottom-up top-down structure, it outputs an attention mask of the same size as the residual feature map, which softly weights the facial features in the residual structure. In the bottom-up top-down structure, down-sampling is performed by a series of convolution and pooling operations, and up-sampling by a deconvolution operation. The attention mask serves as a feature selector in forward propagation and as a gradient filter in backward propagation. Under the attention mask, the gradient of the facial features is computed as:
Figure BDA0003509115890000041
where M represents the attention branch, T represents the residual branch, σ represents the attention branch parameter, and φ is the residual branch parameter.
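For orientation, one worked form of this gradient can be written under the assumption that the attended feature is the element-wise product of the mask and the residual output, as in the Residual Attention Network formulation. The formula image is not reproduced in this text, so the expression below is an assumption rather than a transcription, and x (the input feature) is a symbol introduced here:

```latex
\frac{\partial\, M(x;\sigma)\, T(x;\phi)}{\partial \phi}
  = M(x;\sigma)\, \frac{\partial T(x;\phi)}{\partial \phi}
```

Under this assumption the mask scales the gradient that reaches the residual branch parameters, so locations with small mask values contribute little to the update, which matches the filtering role described above.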
The beneficial effects of the invention are as follows. In a real (uncontrolled) environment, a neonate's face is often occluded or changes pose, producing invisible facial regions, which poses a great challenge to neonatal pain recognition. Existing methods only consider pain recognition on unoccluded, correctly posed neonatal face images. The invention therefore trains a generative adversarial network in a semi-supervised manner to restore an unoccluded, correctly posed face image from neonatal face images with varying poses and occlusion, and thereby obtains facial pain expression features that are only slightly affected by occlusion and pose change, namely the latent vectors of the generator in the generative adversarial network. After further screening and filtering by the subsequent attention mechanism, the essential neonatal pain expression information is obtained and used to recognize the neonatal pain expression, which effectively addresses neonatal pain expression recognition under occlusion and pose change.
Drawings
Fig. 1 is a flow chart of the neonatal pain expression recognition method based on a generative adversarial network;
Fig. 2 is an illustration of the pain feature correction module in the pain recognition model;
Fig. 3 is an illustration of the pain level classification module in the pain recognition model.
Detailed Description
The technical solution of the method of the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are some, not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Referring to Figs. 1 to 3, a neonatal pain expression recognition method based on a generative adversarial network comprises the following steps:
Step S1: use the generative adversarial network to restore an unoccluded, correctly posed face image from neonatal face images that have varying poses and may be occluded; the process is as follows:
The generative adversarial network consists of a generator and a discriminator. The generator is responsible for generating a corrected face image from an input face image; the discriminator is responsible for learning to distinguish images generated by the generator from the ideal face images (unoccluded and correctly posed) in the guide set, where the guide set g consists of all ideal face images in the training set. Through the zero-sum game between the generator and the discriminator, the generator continuously improves its ability to convert input images into unoccluded, correctly posed face images. During the zero-sum game, the generator and the discriminator are trained with four loss functions, and their parameters are optimized by the error backpropagation algorithm. The four loss functions are as follows:
(1) Symmetry loss function
Symmetry is an inherent feature of a normal face, and the symmetry loss is calculated as:
Figure BDA0003509115890000051
where H and W denote the height and width of the image, (n, m) denotes a pixel of the image, and |·| denotes the absolute value. Real-world images may not be exactly symmetric at the pixel level, so the symmetry loss is minimized in Laplacian space;
(2) Adversarial loss function
The discriminator network acts as a supervisor: it is responsible for distinguishing generated face images from ideal images and is trained simultaneously with the generator. The discriminator is trained with the following cross-entropy loss function:
$L_{GAN\text{-}Dis}(g_i, x'_j) = -\log(Dis(g_i)) - \log(1 - Dis(x'_j))$
where $g_i$ denotes a guide-set image, GAN-Dis denotes the discriminator, and $x'_j$ is an image generated by the generator;
For the generator, the adversarial loss function is computed as:
$L_{GAN\text{-}Gen}(x'_j) = -\log(Dis(x'_j))$
where GAN-Gen denotes the generator and Dis denotes the discriminator;
(3) Identity-preserving loss function
Preserving identity is a key part of ideal face generation. A perceptual loss, which maintains perceptual similarity, is adopted to give the pain feature correction module identity-preserving capability. The loss function is computed from the feature maps output by the last two layers of the open-source Light CNN:
Figure BDA0003509115890000061
where $H_l$ and $W_l$ are the height and width of the feature map of the l-th-from-last layer, Ω denotes the feature map, and |·| denotes the absolute value. The identity-preserving loss aims to keep the generated image close to the original image in the deep feature space. Because the pre-trained Light CNN can classify thousands of identities, it is considered able to capture the facial structures or features most important for identity recognition;
(4) Total variation regularization
To improve the spatial smoothness of the generated image and reduce spike artifacts, a total variation regularizer is employed, defined as follows:
Figure BDA0003509115890000062
where H and W represent the height and width of the image, (n, m) represent the pixels of the image, x' is the image generated by the generator, | · | represents an absolute value.
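The two image-level terms in the list above, the symmetry loss and the total variation regularizer, can be sketched on a small grayscale image. This is an illustrative sketch under stated assumptions, since the formula images are not reproduced in this text: the mirror-difference form, the normalization by half the image area, and the anisotropic (sum of absolute neighbor differences) form of total variation are assumptions, as are the names `symmetry_loss` and `total_variation`. The source applies the symmetry loss in Laplacian space; the sketch simply operates on whatever array is passed in.

```python
from typing import List

Image = List[List[float]]  # grayscale image as H rows of W pixel values


def symmetry_loss(img: Image) -> float:
    """Mean absolute difference between each left-half pixel and its mirror pixel.

    Assumes the image is at least two pixels wide.
    """
    h, w = len(img), len(img[0])
    half = w // 2
    total = 0.0
    for n in range(h):
        for m in range(half):
            total += abs(img[n][m] - img[n][w - 1 - m])
    return total / (h * half)


def total_variation(img: Image) -> float:
    """Sum of absolute differences between vertically and horizontally adjacent pixels."""
    h, w = len(img), len(img[0])
    total = 0.0
    for n in range(h):
        for m in range(w):
            if n + 1 < h:
                total += abs(img[n + 1][m] - img[n][m])
            if m + 1 < w:
                total += abs(img[n][m + 1] - img[n][m])
    return total
```

A perfectly mirrored image gives a symmetry loss of zero and a constant image gives a total variation of zero, so both terms penalize only the deviations they are meant to suppress.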
Step S2: take the latent vectors of the generator in the generative adversarial network as corrected facial pain features for subsequent pain analysis;
Step S3: screen and analyze the corrected facial pain features with a residual network combined with an attention mechanism, so as to further remove the interference of occlusion and pose change and output an accurate pain level result; the process is as follows:
The facial pain features corrected through the zero-sum game are screened and analyzed with a residual network combined with an attention mechanism. The network uses a conventional residual structure as its backbone, and its core is the attention mechanism combined with that backbone. An attention branch is built in parallel with the residual branch; based on a bottom-up top-down structure, it outputs an attention mask of the same size as the residual feature map, which softly weights the facial features in the residual structure. In the bottom-up top-down structure, down-sampling is performed by a series of convolution and pooling operations, and up-sampling by a deconvolution operation. The attention mask serves as a feature selector in forward propagation and as a gradient filter in backward propagation. Under the attention mask, the gradient of the facial features is computed as:
Figure BDA0003509115890000071
where M represents the attention branch, T represents the residual branch, σ represents the attention branch parameter, and φ is the residual branch parameter.
In this embodiment, the neonatal pain expression recognition method based on a generative adversarial network is implemented as follows:
1): Acquire a neonatal image set with pain level labels in a real environment. The process is as follows:
A handheld electronic device is used to record one-minute videos in a neonatal ward as raw data. To ensure the authenticity and clinical value of the data, video is recorded only when a clinically painful procedure takes place, the device is not moved to follow the neonate's pose changes, and occlusion is not restricted. After enough raw video has been collected, specially trained nurses with pain assessment experience assess the pain levels according to clinical criteria. Key frames are selected from each video to form the neonatal image set, and the neonate's pain level is assessed with the Neonatal Infant Pain Scale (NIPS). Four pain states derived from the NIPS score serve as the pain level labels of the image set: no pain (NIPS score 0-1), mild pain (NIPS score 2-3), moderate pain (NIPS score 4-5), and severe pain (NIPS score 6-7). The data set is divided into two subsets, "with occlusion" and "without occlusion", according to whether the face is occluded by other objects (such as medical equipment or the neonate's limbs). An open-source method in OpenCV is then used to estimate the neonate's facial pose and obtain the Tait-Bryan angles (pitch, yaw, roll). When the pitch angle is greater than 30° or the yaw angle is greater than 45°, the facial pose is considered non-ideal, which further divides the data set into four subsets. Each subset is split into training and test parts at the usual 7:3 ratio, and the test parts are combined into the complete test set;
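The labeling and subset rules in this step can be written down directly. The sketch below uses the NIPS score ranges and the pitch and yaw thresholds stated above; the function names are hypothetical, and comparing the magnitude of each angle (so that tilts in either direction count) is an assumption, since the text only says "greater than".

```python
def pain_level_from_nips(score: int) -> str:
    """Map a NIPS score (0-7) to one of the four pain level labels."""
    if not 0 <= score <= 7:
        raise ValueError("NIPS score must be between 0 and 7")
    if score <= 1:
        return "no pain"
    if score <= 3:
        return "mild pain"
    if score <= 5:
        return "moderate pain"
    return "severe pain"


def is_ideal_pose(pitch_deg: float, yaw_deg: float) -> bool:
    """Pose is non-ideal when pitch exceeds 30 degrees or yaw exceeds 45 degrees.

    Using abs() is an assumption; the source does not say how sign is handled.
    """
    return not (abs(pitch_deg) > 30.0 or abs(yaw_deg) > 45.0)


def subset_name(occluded: bool, pitch_deg: float, yaw_deg: float) -> str:
    """Name one of the four subsets formed by occlusion and pose type."""
    occ = "with occlusion" if occluded else "without occlusion"
    pose = "ideal pose" if is_ideal_pose(pitch_deg, yaw_deg) else "non-ideal pose"
    return f"{occ}, {pose}"


if __name__ == "__main__":
    print(pain_level_from_nips(4), "|", subset_name(True, 10.0, 50.0))
```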
2): Preprocess the neonatal image set, specifically face detection, cropping, and alignment. The process is as follows:
The neonatal face image is cropped from the neonatal image using the open-source ZFace, which outputs 49 facial landmark points and facial boundary points. The cropped face images are then uniformly aligned. Specifically, in-plane rotation of the face (roll angle) is easier to correct than yaw and pitch, so an affine transformation is used to bring all face images to an upright state; the two-dimensional face image is aligned by a linear transformation under an affine matrix M. The affine matrix M is obtained dynamically from the correspondence between the key-point coordinates in the original image and the key-point coordinates in the reference face. The calculation is as follows:
Figure BDA0003509115890000081
where $a_1$, $b_1$, $c_1$, $a_2$, $b_2$, $c_2$ are the values to be determined in the three-dimensional affine matrix M, (α, β) are the original coordinates, and (u, v) are the transformed coordinates. The three preprocessing stages of face detection, cropping, and alignment can be replaced by other related algorithms.
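As a rough illustration of the alignment step, the sketch below applies an affine matrix to a point and builds one matrix that removes in-plane roll. It assumes the usual convention u = a1·α + b1·β + c1 and v = a2·α + b2·β + c2, which is consistent with the symbols defined above but is not confirmed by the formula image. The function names are hypothetical, and deriving the matrix from two eye key points is a simplification chosen for illustration; the source obtains M from the correspondence with reference-face key points.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Affine = List[List[float]]  # [[a1, b1, c1], [a2, b2, c2]]


def apply_affine(m: Affine, p: Point) -> Point:
    """Map original coordinates (alpha, beta) to transformed coordinates (u, v)."""
    alpha, beta = p
    u = m[0][0] * alpha + m[0][1] * beta + m[0][2]
    v = m[1][0] * alpha + m[1][1] * beta + m[1][2]
    return (u, v)


def roll_alignment_matrix(left_eye: Point, right_eye: Point) -> Affine:
    """Rotation about the eye midpoint that makes the line between the eyes horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    theta = -math.atan2(dy, dx)  # rotate by the negative of the current roll
    cx = (left_eye[0] + right_eye[0]) / 2.0
    cy = (left_eye[1] + right_eye[1]) / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rotation about (cx, cy): p' = R (p - c) + c, expanded into affine form.
    return [
        [cos_t, -sin_t, cx - cos_t * cx + sin_t * cy],
        [sin_t, cos_t, cy - sin_t * cx - cos_t * cy],
    ]


if __name__ == "__main__":
    m = roll_alignment_matrix((0.0, 0.0), (2.0, 2.0))
    print(apply_affine(m, (0.0, 0.0)), apply_affine(m, (2.0, 2.0)))
```

After the transform both eye points share the same vertical coordinate and keep their original distance, which is what bringing the face to an upright state requires.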
3): Construct the pain recognition model, which comprises a pain feature correction module and a pain level classification module. The process is as follows:
As shown in Fig. 2, the pain feature correction module is built mainly on a generative adversarial network. The generative adversarial network mainly comprises a generator and a discriminator; the discriminator is responsible for learning to distinguish images generated by the generator from the guide set, where the guide set g comprises all ideal face images in the training set. Inspired by TP-GAN, the generator in the designed generative adversarial network has two paths, which focus on global shape and local detail respectively. For the local path, five pain-related facial landmarks (left eye, right eye, nose, upper mouth, and lower mouth) are first detected with the open-source MTCNN. Five regions cropped around these landmarks are input to five generators in the local path. The global path is more conventional and uses a single generator; the information fusion strategy between the two paths is similar to that of TP-GAN. A denoising auto-encoder (DAE) is chosen as the generator. A DAE receives an image corrupted by some form of noise and removes the noise by requiring the output image to resemble an ideal version of the original image. The input x (a face image) can be regarded as an ideal face image corrupted by pose change and partial occlusion, and "denoising" is achieved by requiring the DAE to learn to make the output image as close as possible to the guide set. To exploit the advantages of both the generative adversarial network and the DAE, four loss functions are used together, as follows:
(1) Symmetry loss function
Symmetry is an inherent feature of a normal face, and the symmetry loss is calculated as:
Figure BDA0003509115890000091
where H and W denote the height and width of the image, (n, m) denotes a pixel of the image, and |·| denotes the absolute value. Real-world images may not be symmetric at the pixel level, so the symmetry loss is minimized in Laplacian space;
(2) Adversarial loss function
The discriminator network acts as a supervisor: it is responsible for distinguishing generated face images from ideal images and is trained simultaneously with the generator. The discriminator is trained with the following cross-entropy loss function:
$L_{GAN\text{-}Dis}(g_i, x'_j) = -\log(Dis(g_i)) - \log(1 - Dis(x'_j))$
where $g_i$ denotes a guide-set image, GAN-Dis denotes the discriminator, and $x'_j$ is an image generated by the generator;
For the generator, the adversarial loss function is computed as:
$L_{GAN\text{-}Gen}(x'_j) = -\log(Dis(x'_j))$
where GAN-Gen denotes the generator and Dis denotes the discriminator;
(3) Identity-preserving loss function
Preserving identity is a key part of ideal face generation. A perceptual loss, which aims to maintain perceptual similarity, is adopted to give the pain feature correction module identity-preserving capability. Specifically, the loss function is computed from the feature maps output by the last two layers of the open-source Light CNN:
Figure BDA0003509115890000101
where $H_l$ and $W_l$ are the height and width of the feature map of the l-th-from-last layer, Ω denotes the feature map, and |·| denotes the absolute value. The identity-preserving loss aims to keep the generated image close to the original image in the deep feature space. Because the pre-trained Light CNN can classify thousands of identities, it is considered able to capture the facial structures or features most important for identity recognition;
(4) Total variation regularization
To improve the spatial smoothness of the generated image and reduce spike artifacts, a total variation regularizer is employed, defined as follows:
Figure BDA0003509115890000102
where H and W represent the height and width of the image, (n, m) represent the pixels of the image, x' is the image generated by the generator, | · | represents an absolute value;
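Of the four losses listed above, the identity-preserving term is the only one computed on feature maps rather than pixels. The following sketch shows one plausible shape for it, assuming the per-layer term is the absolute difference between feature maps normalized by $H_l \times W_l$ and summed over the two layers; that normalization, the function names, and the use of precomputed feature maps in place of a real Light CNN forward pass are all assumptions for illustration.

```python
from typing import List, Sequence

FeatureMap = List[List[float]]  # one H_l x W_l feature map


def mean_abs_diff(a: FeatureMap, b: FeatureMap) -> float:
    """Absolute difference between two equally sized feature maps, divided by H_l * W_l."""
    h, w = len(a), len(a[0])
    total = sum(abs(a[n][m] - b[n][m]) for n in range(h) for m in range(w))
    return total / (h * w)


def identity_preserving_loss(feats_generated: Sequence[FeatureMap],
                             feats_original: Sequence[FeatureMap]) -> float:
    """Sum the normalized feature distance over the selected layers (two in the source)."""
    return sum(mean_abs_diff(g, o) for g, o in zip(feats_generated, feats_original))
```

In practice the two lists would hold the outputs of the last two Light CNN layers for the generated image and for the original image.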
The pain level classification module analyzes the corrected pain features and outputs the final pain level classification result. The latent vector z of the generator in the pain feature correction module (the output of the encoder in the generator) is regarded as the corrected pain feature and is input to the pain level classification module. As shown in Fig. 3, this network uses a residual structure as its backbone and adds an attention mechanism: an attention branch is built in parallel with the residual branch and, based on a bottom-up top-down structure, outputs an attention mask of the same size that softly weights the facial features. In the bottom-up top-down structure, down-sampling is performed by a series of convolution and pooling operations, and up-sampling by deconvolution. The attention mask serves as a feature selector in forward propagation and as a gradient filter in backward propagation. Under the attention mask, the gradient of the facial features is computed as:
Figure BDA0003509115890000111
where M represents an attention branch, T represents a residual branch, σ represents an attention branch parameter, and φ is a residual branch parameter;
4): Train and test the constructed pain recognition model with the neonatal pain image set. The process is as follows:
The constructed neonatal pain recognition model is trained on the training set of the neonatal image set. For training the pain feature correction module, the generator's loss function is computed as:
$L_{Gen} = \lambda_1 L_{tv} + \lambda_2 L_{id} + \lambda_3 L_{sym} + \eta L_{GAN\text{-}Gen}$
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are weight parameters. The generator receives error signals from the discriminator, so a parameter η is used as the weight of the adversarial loss function. The parameter values are: $\lambda_1 = 5 \times 10^{-3}$; $\lambda_2 = 3 \times 10^{-2}$; $\lambda_3 = 0.3$; $\eta = 0.1$. The discriminator is trained with the adversarial loss function $L_{GAN\text{-}Dis}$ only. Furthermore, because occlusion is difficult to remove completely in real scenes and the five facial landmarks are important for pain assessment, occlusion removal is strengthened for the local-path output images. Specifically, after the pain feature correction module has been trained once, a new generative adversarial network is constructed for each of the 5 DAEs in the local path: the local-path output image and the image region at the corresponding position in the guide set are input to a new discriminator to adversarially train the DAE in the local path. Once training of the pain feature correction module is fully complete, training of the pain level classification module begins, and its parameters are optimized by the error backpropagation algorithm. The loss function of the pain level classification module adds an attention-weight-based constraint term to a mean squared error loss; the gradient computation under the attention mask is described in detail in step S3. Finally, the trained neonatal pain recognition model is applied to the test samples in the neonatal image set to obtain their corresponding pain levels.
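The weighted combination used to train the generator can be sketched with the parameter values given above. This sketch assumes that the three λ weights pair with the total variation, identity-preserving, and symmetry terms in the order in which they appear in the formula; the constant and function names are hypothetical, and the individual loss values are taken as already computed numbers.

```python
LAMBDA_TV = 5e-3   # lambda_1, assumed to weight the total variation term
LAMBDA_ID = 3e-2   # lambda_2, assumed to weight the identity-preserving term
LAMBDA_SYM = 0.3   # lambda_3, assumed to weight the symmetry term
ETA_ADV = 0.1      # eta, weight of the generator's adversarial term


def generator_total_loss(l_tv: float, l_id: float, l_sym: float, l_gan_gen: float) -> float:
    """Weighted sum of the four generator loss terms."""
    return (LAMBDA_TV * l_tv
            + LAMBDA_ID * l_id
            + LAMBDA_SYM * l_sym
            + ETA_ADV * l_gan_gen)


if __name__ == "__main__":
    # With equal raw terms, the symmetry and adversarial terms dominate the total.
    print(generator_total_loss(1.0, 1.0, 1.0, 1.0))
```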
Based on the above method, the invention was verified on a neonatal expression data set. Table 1 compares its performance with that of other neonatal pain expression recognition methods; as Table 1 shows, the invention has a significant performance advantage under occlusion and pose change. In addition, an ablation experiment was performed on the effect of the attention branch on neonatal pain recognition accuracy; the results are shown in Table 2.
Figure BDA0003509115890000121
TABLE 1
Figure BDA0003509115890000122
TABLE 2
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (3)

1. A neonatal pain expression recognition method based on a generative adversarial network, the method comprising the following steps:
step S1: using the generative adversarial network, restoring an unoccluded face image with a correct pose from neonatal face images that have varying poses and may be occluded;
step S2: taking the feature vectors of the generator in the generative adversarial network as corrected facial features for subsequent pain analysis;
step S3: screening and analyzing the corrected facial features with a residual network combined with an attention mechanism, so as to further remove interference from occlusion and pose variation and output an accurate pain level result.
2. The neonatal pain expression recognition method based on a generative adversarial network according to claim 1, wherein step S1 is performed as follows:
the generative adversarial network consists of a generator and a discriminator. The generator is responsible for generating a corrected face image from an input face image; the discriminator is responsible for learning to distinguish images generated by the generator from the ideal face images in the guide set, which are unoccluded and correctly posed. The guide set g consists of all ideal face images in the training set. Through the zero-sum game between the generator and the discriminator, the generator continuously improves its ability to convert input images into unoccluded, correctly posed face images. During the zero-sum game, the generator and the discriminator are trained with four loss functions, and their parameters are optimized by the error back-propagation algorithm. The four loss functions are as follows:
(1) Symmetry loss function
Symmetry is an inherent feature of a normal face, and the symmetry loss is calculated as:
Figure FDA0003509115880000011
where H and W are the height and width of the image, (n, m) indexes the pixels of the image, and |·| denotes the absolute value. Because real-world images are not exactly symmetric at the pixel level, the symmetry loss is minimized in Laplacian space;
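The equation image for the symmetry loss is not reproduced in this text. As a minimal sketch, assuming the loss is the mean absolute difference between each pixel in the left half of the image and its horizontally mirrored counterpart (the normalization and the function name `symmetry_loss` are assumptions, not taken from the claims), the computation could look like this. To follow the claim's remark about Laplacian space, the same function would be applied to a Laplacian-filtered image rather than to raw pixels.

```python
def symmetry_loss(image):
    """Mean absolute difference between each pixel and its horizontal mirror.

    `image` is a list of H rows, each a list of W grayscale values.
    Only the left half is scanned, so each mirrored pair is counted once.
    The 1 / (H * W/2) normalization is an assumption.
    """
    height = len(image)
    width = len(image[0])
    half = width // 2
    total = 0.0
    for n in range(height):
        for m in range(half):
            total += abs(image[n][m] - image[n][width - 1 - m])
    return total / (height * half)


# A perfectly mirrored image has zero loss; one asymmetric pixel raises it.
symmetric = [[1.0, 2.0, 2.0, 1.0],
             [3.0, 4.0, 4.0, 3.0]]
skewed = [[1.0, 2.0, 2.0, 5.0],
          [3.0, 4.0, 4.0, 3.0]]
print(symmetry_loss(symmetric))  # 0.0
print(symmetry_loss(skewed))     # 1.0
```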
(2) Adversarial loss function
The discriminator network acts as a supervisor: it is responsible for distinguishing the generated face image from the ideal image and is trained together with the generator. The discriminator is trained with the following cross-entropy loss function:
$L_{\text{GAN-Dis}}(g_i, x'_j) = -\log(\mathrm{Dis}(g_i)) - \log(1 - \mathrm{Dis}(x'_j))$
where $g_i$ is a guide-set image, GAN-Dis denotes the discriminator, and $x'_j$ is the image generated by the generator;
for the generator, the adversarial loss function is computed as:
$L_{\text{GAN-Gen}}(x'_j) = -\log(\mathrm{Dis}(x'_j))$
where GAN-Gen denotes the generator and Dis denotes the discriminator;
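The two adversarial losses above can be evaluated directly once the discriminator outputs are known. The sketch below assumes the discriminator returns a probability strictly between 0 and 1; the function names are illustrative and do not come from the patent.

```python
import math


def discriminator_loss(d_guide, d_generated):
    """-log(Dis(g_i)) - log(1 - Dis(x'_j)).

    d_guide: discriminator output for a guide-set image, in (0, 1).
    d_generated: discriminator output for a generated image, in (0, 1).
    """
    return -math.log(d_guide) - math.log(1.0 - d_generated)


def generator_adversarial_loss(d_generated):
    """-log(Dis(x'_j)): small when the discriminator rates the output as ideal."""
    return -math.log(d_generated)


# A discriminator that separates the two image sources well has a low loss,
# while the generator's loss is high because its image was rejected.
good_dis = discriminator_loss(d_guide=0.9, d_generated=0.1)
unsure_dis = discriminator_loss(d_guide=0.5, d_generated=0.5)
print(good_dis < unsure_dis)
print(generator_adversarial_loss(0.1) > generator_adversarial_loss(0.9))
```

The second comparison shows the zero-sum character described in the claim: the generator's loss falls as the discriminator's score for the generated image rises.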
(3) Identity preservation loss function
Preserving identity is a key part of ideal face generation. A perceptual loss is adopted to maintain perceptual similarity and thereby help the pain feature correction module acquire identity-preserving ability. The loss function is calculated from the feature maps output by the last two layers of the open-source Light CNN:
Figure FDA0003509115880000012
where $H_l$ and $W_l$ are the height and width of the feature map of the $l$-th layer from the last, ω denotes the feature map, and |·| denotes the absolute value. The identity preservation loss aims to keep the generated image close to the original image in the deep feature space. Because Light CNN can classify thousands of identities after pre-training, it can capture the facial structures and features that matter most for identity recognition;
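The equation image for the identity preservation loss is not reproduced in this text. A minimal sketch, assuming the loss sums, over the last two layers, the absolute feature difference normalized by each feature map's height times width, is given below. The feature maps here are plain nested lists standing in for Light CNN outputs; no real Light CNN API is called, and the function names are hypothetical.

```python
def mean_abs_diff(map_a, map_b):
    """Absolute difference of two H_l x W_l feature maps, divided by H_l * W_l."""
    height = len(map_a)
    width = len(map_a[0])
    total = 0.0
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
    return total / (height * width)


def identity_preservation_loss(features_generated, features_original):
    """Sum the normalized differences over the supplied layers.

    Each argument is a list of feature maps, one per layer; per the claim,
    these would be the last two layers of a pre-trained network.
    """
    return sum(mean_abs_diff(fa, fb)
               for fa, fb in zip(features_generated, features_original))


# Two stand-in layers: a 2x2 map and a 1x2 map (values are made up).
generated = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5]]]
original = [[[1.0, 2.0], [3.0, 6.0]], [[0.5, 1.5]]]
print(identity_preservation_loss(generated, original))  # 1.0
```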
(4) Total variation regularization
To improve the spatial smoothness of the generated image and reduce spike artifacts, a total variation regularizer is employed, defined as follows:
Figure FDA0003509115880000021
where H and W represent the height and width of the image, (n, m) represent the pixels of the image, x' is the image generated by the generator, | · | represents an absolute value.
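The equation image for the regularizer is not reproduced in this text. The sketch below uses the common anisotropic form of total variation, which sums the absolute differences between vertically and horizontally adjacent pixels of the generated image x'. Whether the patent's formula uses exactly this form, or adds a normalization, is an assumption.

```python
def total_variation(image):
    """Sum of absolute differences between neighbouring pixels.

    `image` is a list of H rows, each a list of W values (the generated image x').
    Differences that would cross the image border are skipped.
    """
    height = len(image)
    width = len(image[0])
    total = 0.0
    for n in range(height):
        for m in range(width):
            if n + 1 < height:
                total += abs(image[n + 1][m] - image[n][m])
            if m + 1 < width:
                total += abs(image[n][m + 1] - image[n][m])
    return total


# A flat image has no variation; a single spike is penalized on all four sides.
flat = [[1.0, 1.0], [1.0, 1.0]]
spike = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
print(total_variation(flat))   # 0.0
print(total_variation(spike))  # 36.0
```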
3. The neonatal pain expression recognition method based on a generative adversarial network according to claim 1 or 2, wherein step S3 is performed as follows:
the facial pain features corrected through the zero-sum game are screened and analyzed with a residual network combined with an attention mechanism. An attention branch is constructed in parallel with the residual branch; based on a bottom-up top-down structure, it outputs an attention mask of the same size as the feature map in the residual structure and applies soft weighting to the facial features in the residual structure. In the bottom-up top-down structure, "down-sampling" is achieved by a series of convolution and pooling operations, while "up-sampling" is achieved by a deconvolution operation. The attention mask output by the attention mechanism serves as a feature selector in forward propagation and as a filter in backward gradient updates. The gradient of the facial features under the attention mask is calculated as:
Figure FDA0003509115880000022
where M represents the attention branch, T represents the residual branch, σ represents the attention branch parameter, and φ is the residual branch parameter.
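The equation image for this gradient is not reproduced in this text. If the soft weighting is taken to be an elementwise product M(x; σ)·T(x; φ), which is an assumption here, then M does not depend on φ and the product rule gives ∂(M·T)/∂φ = M(x; σ)·∂T(x; φ)/∂φ, so the mask scales the gradient that reaches the residual branch. The toy scalar branches below are hypothetical stand-ins chosen only to check that identity numerically.

```python
import math


def attention_mask(x, sigma):
    """Toy attention branch M(x; sigma): a sigmoid gate with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-sigma * x))


def residual_branch(x, phi):
    """Toy residual branch T(x; phi): a linear map."""
    return phi * x


def masked_output(x, sigma, phi):
    """Soft-weighted feature M(x; sigma) * T(x; phi)."""
    return attention_mask(x, sigma) * residual_branch(x, phi)


def analytic_grad_phi(x, sigma, phi):
    """M(x; sigma) * dT/dphi; for the linear toy branch, dT/dphi = x."""
    return attention_mask(x, sigma) * x


def numeric_grad_phi(x, sigma, phi, eps=1e-6):
    """Central finite difference of the masked output with respect to phi."""
    upper = masked_output(x, sigma, phi + eps)
    lower = masked_output(x, sigma, phi - eps)
    return (upper - lower) / (2.0 * eps)


x, sigma, phi = 2.0, 0.5, 3.0
print(math.isclose(analytic_grad_phi(x, sigma, phi),
                   numeric_grad_phi(x, sigma, phi), abs_tol=1e-6))
```

Because the mask value lies between 0 and 1 in this toy setup, features that the attention branch suppresses also receive a proportionally smaller gradient, which matches the claim's description of the mask as a filter during backward updates.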
CN202210147904.1A 2022-02-17 2022-02-17 Newborn pain expression recognition method based on generation of confrontation network Pending CN114639138A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210147904.1A CN114639138A (en) 2022-02-17 2022-02-17 Newborn pain expression recognition method based on generation of confrontation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210147904.1A CN114639138A (en) 2022-02-17 2022-02-17 Newborn pain expression recognition method based on generation of confrontation network

Publications (1)

Publication Number Publication Date
CN114639138A true CN114639138A (en) 2022-06-17

Family

ID=81945574

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210147904.1A Pending CN114639138A (en) 2022-02-17 2022-02-17 Newborn pain expression recognition method based on generation of confrontation network

Country Status (1)

Country Link
CN (1) CN114639138A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114842681A (en) * 2022-07-04 2022-08-02 中国电子科技集团公司第二十八研究所 Airport scene flight path prediction method based on multi-head attention mechanism

Similar Documents

Publication Publication Date Title
CN111652827B (en) Front face synthesis method and system based on generation countermeasure network
Theagarajan et al. Soccer: Who has the ball? Generating visual analytics and player statistics
US9042606B2 (en) Hand-based biometric analysis
CN105869166B (en) A kind of human motion recognition method and system based on binocular vision
CN111460976B (en) Data-driven real-time hand motion assessment method based on RGB video
CN108875586B (en) Functional limb rehabilitation training detection method based on depth image and skeleton data multi-feature fusion
Kantarcı et al. Thermal to visible face recognition using deep autoencoders
Aydogdu et al. Comparison of three different CNN architectures for age classification
Czyzewski et al. Chessboard and chess piece recognition with the support of neural networks
CN109325408A (en) A kind of gesture judging method and storage medium
CN117095128A (en) Priori-free multi-view human body clothes editing method
Wang et al. Age-oriented face synthesis with conditional discriminator pool and adversarial triplet loss
CN114639138A (en) Newborn pain expression recognition method based on generation of confrontation network
CN108154176A (en) A kind of 3D human body attitude algorithm for estimating for single depth image
Lim et al. Pose transforming network: Learning to disentangle human posture in variational auto-encoded latent space
Mallis et al. From keypoints to object landmarks via self-training correspondence: A novel approach to unsupervised landmark discovery
CN112288645A (en) Skull face restoration model construction method, restoration method and restoration system
CN114399731B (en) Target positioning method under supervision of single coarse point
CN115841602A (en) Construction method and device of three-dimensional attitude estimation data set based on multiple visual angles
Wang Three-Dimensional Image Recognition of Athletes' Wrong Motions Based on Edge Detection.
Kumar et al. A pragmatic approach to face recognition using a novel deep learning algorithm
CN111241870A (en) Terminal device and face image recognition method and system thereof
Henderson et al. Encoding Kinematic and Temporal Gait Data in an Appearance-Based Feature for the Automatic Classification of Autism Spectrum Disorder
Mudassar et al. FocalNet-Foveal Attention for Post-processing DNN Outputs
Jiang Application of Rotationally Symmetrical Triangulation Stereo Vision Sensor in National Dance Movement Detection and Recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination