CN113191171B - Pain intensity evaluation method based on feature fusion - Google Patents
- Publication number
- CN113191171B (application CN202010034499.3A)
- Authority
- CN
- China
- Prior art keywords
- cnn
- pain intensity
- features
- face
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention provides a pain intensity assessment method based on feature fusion, which performs pain intensity regression on face image data using a three-dimensional convolutional neural network (3D CNN) and a two-dimensional convolutional neural network (2D CNN). The method comprises the following steps. First, an affine transformation is applied to the face data in the original data set to obtain image data of fixed shape with the background information removed. Then, the 3D CNN extracts dynamic temporal features from the image sequence while the 2D CNN extracts static spatial features from a single image; the features extracted in these two ways are fused by feature-map concatenation and serve as the input of a subsequent 3D CNN, which finally predicts the pain intensity by regression. The method exploits the strengths of deep learning, using convolutional neural networks of different dimensions to extract the dynamic temporal information and the static spatial information of the input data respectively, thereby improving the accuracy of pain intensity regression.
Description
Technical Field
The invention relates to the problem of pain intensity assessment in the field of deep learning, and in particular to a pain intensity assessment method based on feature fusion.
Background
The advent of visual sensor data has created opportunities in many fields. In clinical medicine in particular, analyzing visual sensing image data can reveal a patient's health state and provide a scientific basis for the design of related prototype systems. By estimating a patient's pain intensity from visual sensing image data with a deep learning method, medical staff can be given a quantitative value of the patient's pain intensity.
Pain intensity assessment is an important research topic in the field of computer vision and has attracted wide attention from researchers at home and abroad; it is also an important component of facial emotion recognition and is therefore of great research value. Recent advances in deep learning provide opportunities for processing large volumes of medical visual sensing data: analyzing patient image data acquired by visual sensors with deep learning methods can provide a foundation for building corresponding intelligent systems. Existing pain intensity assessment data sets are video sequences of patients' facial expressions, so this patent uses dynamic video sequences to assess a patient's facial pain intensity.
Deep Learning has attracted much attention in recent years. It stacks multiple layers of abstract data processing into a computational model that replaces traditional hand-crafted feature selection, letting the machine learn the features of data samples autonomously and effectively avoiding the drawbacks of manual feature selection. Compared with manually selected features, features learned from large amounts of data describe the data more richly. In short, deep learning is a great improvement over conventional methods in both recognition time and accuracy.
Disclosure of Invention
The invention aims to provide a pain intensity evaluation method based on feature fusion, which introduces a three-dimensional convolutional neural network (3D CNN) and a two-dimensional convolutional neural network (2D CNN) from deep learning, extracts the dynamic temporal features and static spatial features of the input data respectively, and then fuses the features extracted in these two ways. This effectively alleviates the difficulties of current shallow learning, such as laborious parameter tuning and low accuracy.
For convenience of explanation, the following concepts are first introduced:
Two-dimensional Convolutional Neural Network (2D CNN): a convolutional neural network designed with inspiration from the visual neural mechanism. It is a multilayer feedforward neural network in which each layer consists of several two-dimensional planes, each neuron on a plane works independently, and the network mainly comprises feature extraction layers and feature mapping layers. It is mainly used for feature extraction from a single image.
Three-dimensional Convolutional Neural Network (3D CNN): similar to a two-dimensional convolutional neural network, but mainly used for feature extraction from image sequences.
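To make the distinction between these two network types concrete, the following NumPy sketch (illustrative only; it is not the patent's actual network, and the kernels are arbitrary) shows that a 2D convolution slides only over the height and width of one image, while a 3D convolution also slides along the time axis of a frame sequence. The 16-frame, 112 × 112 sample sizes match those used later in the method:

```python
import numpy as np

def conv2d(img, k):
    """Naive 'valid' 2D convolution: the kernel slides over height and width."""
    H, W = img.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def conv3d(clip, k):
    """Naive 'valid' 3D convolution: the kernel also slides along time."""
    T, H, W = clip.shape
    kt, kh, kw = k.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[t, i, j] = np.sum(clip[t:t + kt, i:i + kh, j:j + kw] * k)
    return out

frame = np.random.rand(112, 112)      # one 112 x 112 face image
clip = np.random.rand(16, 112, 112)   # a 16-frame sample, as used in the method
print(conv2d(frame, np.ones((3, 3))).shape)    # (110, 110): spatial axes only
print(conv3d(clip, np.ones((3, 3, 3))).shape)  # (14, 110, 110): time axis as well
```

The extra output dimension of `conv3d` is what carries the dynamic temporal information that a per-frame 2D convolution cannot capture.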
Feature Map: the multi-dimensional output produced when input data passes through a convolutional neural network.
Feature fusion: the static spatial features extracted by the 2D CNN and the dynamic temporal features extracted by the 3D CNN are fused by concatenating their feature maps.
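The splicing itself can be sketched in a few lines of NumPy. The tensor shapes below are assumptions for illustration (the patent does not state channel counts or spatial sizes); the static map is tiled across the time axis so the two tensors align, then stacked along the channel axis:

```python
import numpy as np

# Assumed shapes: (channels, time, height, width) for the 3D-CNN output,
# (channels, height, width) for the 2D-CNN output of the last frame.
temporal_feat = np.random.rand(64, 16, 28, 28)  # dynamic features from the 3D CNN
spatial_feat = np.random.rand(64, 28, 28)       # static features from the 2D CNN

# Tile the static map across the 16 time steps, then concatenate along
# the channel axis -- the "feature-map splicing" described above.
spatial_tiled = np.broadcast_to(spatial_feat[:, None], temporal_feat.shape)
fused = np.concatenate([temporal_feat, spatial_tiled], axis=0)
print(fused.shape)  # (128, 16, 28, 28): input to the subsequent 3D CNN
```

Concatenation keeps both feature sets intact and lets the subsequent 3D CNN learn how to weight them, rather than committing to a fixed combination such as element-wise addition.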
The invention specifically adopts the following technical scheme:
A pain intensity evaluation method based on feature fusion, mainly characterized in that:
a. the original image data are affine-transformed and cropped to obtain a face image sequence without background information;
b. the face image sequence obtained in step a is fed, 16 frames per sample, into a 3D CNN to extract the temporal features of the image sequence, and the last frame of each sample is fed into a 2D CNN to extract the static spatial features of the image;
c. the two kinds of features obtained in step b are fused, and the fused features are input into a 3D CNN for higher-level feature learning, yielding the final pain intensity evaluation network model;
the method mainly comprises the following steps:
(1) preprocessing the pain data set: all faces in the original image sequence are fixed to the same shape by a face affine transformation, the background information in the original image data is removed to obtain images containing only face information, the face region is cropped, and the resolution is set to 112 × 112, yielding the preprocessed pain data set;
(2) extracting the dynamic temporal features and static spatial features of an input sample with the 3D CNN and the 2D CNN respectively;
(3) fusing the dynamic temporal features and static spatial features extracted in step (2), inputting the fused features into a 3D CNN for higher-level feature learning, and finally performing regression prediction to obtain the final pain intensity regression network model.
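The face normalization of step (1) can be sketched as fitting an affine transform that maps detected facial landmarks onto a fixed template inside the 112 × 112 canvas. The landmark coordinates below are made-up illustrative values; a real pipeline would also warp the image pixels with an image-processing library, not just the points:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine transform mapping src landmarks onto dst."""
    A = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coords, (N, 3)
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)   # solve A @ M = dst
    return M.T                                    # (2, 3) affine matrix

# Hypothetical landmarks (x, y): two eye corners and the nose tip.
detected = np.array([[210.0, 180.0], [290.0, 175.0], [252.0, 240.0]])
template = np.array([[35.0, 45.0], [77.0, 45.0], [56.0, 80.0]])  # 112x112 canvas

M = fit_affine(detected, template)
homog = np.hstack([detected, np.ones((3, 1))])
warped = (M @ homog.T).T
print(np.allclose(warped, template, atol=1e-6))  # landmarks land on the template
```

With three non-collinear landmark pairs the affine transform is determined exactly; with more landmarks the least-squares fit gives the best compromise, which is what makes all faces in the sequence take the same shape.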
The invention has the beneficial effects that:
(1) The self-learning strength of deep learning is fully exploited: the machine learns image features automatically, which avoids the bias and inefficiency of manual feature selection and adapts better to new data.
(2) The 3D CNN fully extracts the dynamic temporal features of the input sample and the 2D CNN fully extracts its static spatial features; the complementary strengths of static and dynamic features are combined, improving training accuracy.
(3) The two kinds of features are fused by feature-map concatenation and input into a subsequent 3D CNN for higher-level feature learning, improving the final prediction performance.
(4) Combining deep learning with pain intensity evaluation addresses the low accuracy of traditional methods and increases the research value.
Drawings
FIG. 1 is an affine transformation process of a face image sequence in the present invention.
Fig. 2 shows the complete network structure of the present invention.
FIG. 3 illustrates the fusion of two different features of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the drawings and examples. It should be noted that the following examples only illustrate the invention and should not be construed as limiting its scope; insubstantial modifications and adaptations made by those skilled in the art based on the above disclosure still fall within the scope of the invention.
As shown in fig. 2, the pain intensity assessment method based on feature fusion specifically includes the following steps:
(1) The original pain data set is affine-transformed as shown in fig. 1, and the resolution of the resulting face images is set to 112 × 112.
(2) The 3D CNN and the 2D CNN extract the dynamic temporal and static spatial features of the images respectively: 16 consecutive frames of the preprocessed data form one sample; the 3D CNN extracts the dynamic temporal features of the sample while the 2D CNN extracts the static spatial features of its last frame; the two kinds of features are fused by the feature-map fusion scheme of fig. 3; the fused features are fed into the subsequent 3D CNN for higher-level feature learning, finally yielding the pain intensity estimate.
(3) Training: the network structure of fig. 2 is trained. The data set used here contains 25 subjects in total, so the training strategy is leave-one-subject-out: the data of one subject serve as the test set and the data of the other 24 subjects as the training set, and this is repeated until every subject has been used as the test set.
(4) After step (3), 25 sets of experimental results are obtained; averaging them gives the final prediction result.
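The leave-one-subject-out protocol of steps (3) and (4) can be sketched as follows; `train_model` and `evaluate` in the comment are hypothetical placeholders, and a random number stands in for each fold's test error:

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (test_subject, train_subjects) splits, one fold per subject."""
    for s in subject_ids:
        yield s, [t for t in subject_ids if t != s]

subjects = list(range(25))        # the data set used here contains 25 subjects
fold_errors = []
for test_subj, train_subjs in leave_one_subject_out(subjects):
    # Train on the 24 remaining subjects, test on the held-out one, e.g.:
    # model = train_model(train_subjs); err = evaluate(model, test_subj)
    err = np.random.rand()        # stand-in for the fold's test error
    fold_errors.append(err)

final_score = float(np.mean(fold_errors))  # step (4): average over the 25 folds
print(len(fold_errors))                    # one result set per subject
```

Splitting by subject rather than by frame prevents frames of the same person from appearing in both the training and test sets, which would otherwise inflate the reported accuracy.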
Claims (4)
1. A pain intensity assessment method based on feature fusion, characterized in that:
a. the original image data are affine-transformed and cropped to obtain a face image sequence without background information;
b. the face image sequence obtained in step a is fed, 16 frames per sample, into a 3D CNN to extract the temporal features of the image sequence, and the last frame of each sample is fed into a 2D CNN to extract the static spatial features of the image;
c. the two kinds of features obtained in step b are fused, and the fused features are input into a 3D CNN for higher-level feature learning, yielding the final pain intensity evaluation network model;
the method mainly comprises the following steps:
(1) preprocessing the pain data set: all faces in the original image sequence are fixed to the same shape by a face affine transformation, the background information in the original image data is removed to obtain images containing only face information, the face region is cropped, and the resolution is set to 112 × 112, yielding the preprocessed pain data set;
(2) extracting the dynamic temporal features and static spatial features of an input sample with the 3D CNN and the 2D CNN respectively;
(3) fusing the dynamic temporal features and static spatial features extracted in step (2), inputting the fused features into a 3D CNN for higher-level feature learning, and finally performing regression prediction to obtain the final pain intensity regression network model.
2. The method according to claim 1, wherein in step (1) all images in the original image sequence are fixed to the same shape by a face affine transformation and the background information in the original image data is removed, so that images containing only face information are obtained.
3. The method according to claim 1, wherein in step (2) the 3D CNN and the 2D CNN are used to extract the dynamic temporal features and static spatial features of the input sample respectively.
4. The method according to claim 1, wherein in step (3) the dynamic temporal features and static spatial features extracted by the 3D CNN and the 2D CNN are fused, and the fused features are input into the subsequent 3D CNN for higher-level feature learning.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010034499.3A CN113191171B (en) | 2020-01-14 | 2020-01-14 | Pain intensity evaluation method based on feature fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113191171A CN113191171A (en) | 2021-07-30 |
CN113191171B true CN113191171B (en) | 2022-06-17 |
Family
ID=76972291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010034499.3A Active CN113191171B (en) | 2020-01-14 | 2020-01-14 | Pain intensity evaluation method based on feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113191171B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106682616A (en) * | 2016-12-28 | 2017-05-17 | 南京邮电大学 | Newborn painful-expression recognition method based on dual-channel-feature deep learning |
CN107679491A (en) * | 2017-09-29 | 2018-02-09 | 华中师范大学 | A 3D convolutional neural network sign language recognition method fusing multi-modal data |
WO2018126275A1 (en) * | 2016-12-30 | 2018-07-05 | Dirk Schneemann, LLC | Modeling and learning character traits and medical condition based on 3d facial features |
CN108898606A (en) * | 2018-06-20 | 2018-11-27 | 中南民族大学 | Automatic segmentation method, system, device and storage medium for medical images |
CN109815785A (en) * | 2018-12-05 | 2019-05-28 | 四川大学 | A facial emotion recognition method based on a two-stream convolutional neural network |
CN109903255A (en) * | 2019-03-04 | 2019-06-18 | 北京工业大学 | A hyperspectral image super-resolution method based on 3D convolutional neural networks |
CN109993151A (en) * | 2019-04-15 | 2019-07-09 | 方玉明 | A 3D video visual attention detection method based on a multi-modal fully convolutional network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9031293B2 (en) * | 2012-10-19 | 2015-05-12 | Sony Computer Entertainment Inc. | Multi-modal sensor based emotion recognition and emotional interface |
-
2020
- 2020-01-14 CN CN202010034499.3A patent/CN113191171B/en active Active
Non-Patent Citations (4)
Title |
---|
Facial Expression Recognition Based on Group Domain Random Frame Extraction; Wenjun Zhou et al.; International Conference on Image and Graphics; 2019-12-31; 467-479 *
Semi-CNN Architecture for Effective Spatio-Temporal Learning in Action Recognition; Mei Chee Leong et al.; Applied Sciences; 2020; 1-14 *
Lung nodule detection in CT images combining two-dimensional and three-dimensional convolutional neural networks; Miao Guang et al.; Laser & Optoelectronics Progress; 2017-12-01; 051006-1-9 *
Brain glioma segmentation algorithm using a 3D convolutional network with attention; Hu Rui et al.; Computer Engineering and Applications; 2019-09-04; vol. 56, no. 12; 187-192 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020215985A1 (en) | Medical image segmentation method and device, electronic device and storage medium | |
CN107679522B (en) | Multi-stream LSTM-based action identification method | |
CN109846469B (en) | Non-contact heart rate measurement method based on convolutional neural network | |
CN109902558B (en) | CNN-LSTM-based human health deep learning prediction method | |
CN109920538B (en) | Zero sample learning method based on data enhancement | |
CN113724206B (en) | Fundus image blood vessel segmentation method and system based on self-supervision learning | |
CN109063643B (en) | Facial expression pain degree identification method under condition of partial hiding of facial information | |
CN110859624A (en) | Brain age deep learning prediction system based on structural magnetic resonance image | |
Wei et al. | Real-time facial expression recognition for affective computing based on Kinect | |
CN110477907B (en) | Modeling method for intelligently assisting in recognizing epileptic seizures | |
CN113781640A (en) | Three-dimensional face reconstruction model establishing method based on weak supervised learning and application thereof | |
Liu et al. | PRA-Net: Part-and-Relation Attention Network for depression recognition from facial expression | |
WO2023173804A1 (en) | Brain-computer information fusion classification method and system for shared subspace learning | |
CN112288749A (en) | Skull image segmentation method based on depth iterative fusion depth learning model | |
CN110889335B (en) | Human skeleton double interaction behavior identification method based on multichannel space-time fusion network | |
CN113486700A (en) | Facial expression analysis method based on attention mechanism in teaching scene | |
CN113191171B (en) | Pain intensity evaluation method based on feature fusion | |
CN103235943A (en) | Principal component analysis-based (PCA-based) three-dimensional (3D) face recognition system | |
CN112634293A (en) | Temporal bone inner ear bone cavity structure automatic segmentation method based on coarse-to-fine dense coding and decoding network | |
CN116993699A (en) | Medical image segmentation method and system under eye movement auxiliary training | |
CN116091793A (en) | Light field significance detection method based on optical flow fusion | |
CN116092189A (en) | Bimodal human behavior recognition method based on RGB data and bone data | |
CN115601840A (en) | Behavior disorder detection method considering vision and plantar pressure multi-mode sensing | |
CN116115240A (en) | Electroencephalogram emotion recognition method based on multi-branch chart convolution network | |
Yu et al. | Research on 3D Medical Image Segmentation based on improved 3D-Unet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||