CN108596069A - Neonatal pain expression recognition method and system based on depth 3D residual error networks - Google Patents
- Publication number
- CN108596069A (application CN201810346075.3A, filed by CN201810346075A)
- Authority
- CN
- China
- Prior art keywords
- convolutional layer
- convolution
- residual
- branch
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Abstract
The invention discloses a neonatal pain expression recognition method and system based on a deep 3D residual network. The method includes: building a neonatal expression video library containing pain expression class labels, and dividing the samples in the library into a training set and a validation set; constructing a deep 3D residual network for neonatal pain expression recognition, pre-training the network on a public large-scale labeled video database to obtain initial weight values, and then fine-tuning the network with the training and validation samples from the neonatal expression video library to obtain a trained network model; and inputting a neonatal expression video clip to be tested into the trained model for expression classification, yielding the pain expression recognition result. By using a deep 3D residual network to extract spatio-temporal features from video that reflect temporal information, the invention better characterizes changes in facial expression and thereby improves classification accuracy.
Description
Technical field
The present invention relates to the fields of facial expression recognition and machine learning, and in particular to a neonatal pain expression recognition method and system based on a deep 3D residual network.
Background technology
Scientific research has shown that newborns are capable of sensing pain. Neonatal pain mainly arises from painful clinical procedures, including heel-prick blood sampling, arterial and venous puncture, tracheal intubation, and subcutaneous and intramuscular injection. Repeated or sustained painful stimulation has a series of serious short-term and long-term effects on a newborn's growth and development, leading to harms such as slowed intellectual development, central nervous system injury, and emotional disorders. Pain assessment is a key link in pain control, so correctly assessing pain and taking appropriate analgesic measures in time to relieve neonatal pain has important clinical value and far-reaching significance for improving the quality of China's population.
In current clinical practice, pain assessment is performed manually by specially trained medical staff. Manual assessment is not only time-consuming and labor-intensive, but its results also depend on the experience of the staff and are affected by subjective factors such as personal mood. Moreover, because medical resources in China are unevenly distributed, small cities and remote rural areas are relatively short of healthcare resources, and in particular lack pediatric specialists, so the degree of neonatal pain cannot be objectively assessed there. There is therefore an urgent need for a computer-assisted automatic neonatal pain assessment system that provides diagnostic assistance to parents and medical staff, so that appropriate analgesic measures can be taken in time to relieve neonatal pain.
Some research already exists on automatic neonatal pain assessment. For example, the Chinese patent application "Neonatal pain recognition method based on facial expression analysis" (application No. 201710628847.8, publication No. CN107491740A) extracts facial dynamic geometric features and facial dynamic texture features from a video sequence, fuses the features, and then performs dimensionality reduction and classification; however, automatically and accurately extracting those feature parameters is very difficult. The Chinese patent "Classification method for neonatal pain and non-pain expressions based on sparse representation" (patent No. ZL201210077351.3) builds the over-complete dictionary of a sparse representation model from the feature vectors of training samples, treats a test sample as a linear combination of the training samples in the over-complete dictionary, and exploits the inherent sparsity to classify pain versus non-pain expressions; however, this method requires a carefully designed over-complete dictionary satisfying the sparsity constraints, performs only binary classification of pain versus non-pain, and does not assess the degree of pain.
To extract facial expression features automatically and avoid the limitations and subjectivity of hand-crafted features, the present inventors have proposed several neural-network-based neonatal pain expression recognition methods, such as "Neonatal pain expression classification method based on convolutional neural networks" (application No. CN201611233381.3, publication No. CN106778657A) and "Neonatal pain expression recognition method based on deep neural networks" (application No. CN201710497593.0, publication No. CN107392109A). The performance of convolutional neural networks in image classification tasks benefits largely from deeper network models. However, if network depth is increased merely by stacking convolutional layers, classification accuracy actually declines once the number of layers exceeds a certain value. On the other hand, using a 2D convolutional neural network to extract facial expression features from still images ignores the dynamics between adjacent video frames and cannot characterize changes in facial expression well.
To extract features in both the temporal and spatial domains of a video with a deep neural network, one direct approach is to expand the 2D convolutions used for image feature learning into 3D convolutions that operate in the spatial and temporal dimensions simultaneously. A 3D convolutional neural network built from such 3D convolution operations can obtain per-frame image features while also expressing how adjacent frames relate and change over time. In practice, however, this design faces certain difficulties: first, introducing the time dimension substantially increases the number of parameters, the running time, and the memory required for training; second, the randomly initialized 3D convolution kernels require a large number of labeled video samples for training.
Summary of the invention
Object of the invention: In view of the problems in the prior art, the present invention aims to provide a neonatal pain expression recognition method and system based on a deep 3D residual network. Applying the residual unit structure to a deep convolutional neural network effectively alleviates the vanishing-gradient problem in back-propagation during training, and thus solves the problems that deep networks are hard to train and suffer performance degradation. At the same time, the 3D convolution operation is realized as a combination of 2D and 1D convolutions; compared with a 2D convolutional neural network of the same depth, this adds only a limited number of 1D convolutions and does not cause excessive growth in parameter count or running time.
Technical solution: To achieve the above object, the present invention adopts the following technical solution:
A neonatal pain expression recognition method based on a deep 3D residual network includes the following steps:
(1) Collect the required neonatal expression video clip samples, trim each video clip into a frame sequence of equal length, build a neonatal expression video library containing pain expression class labels, and divide the samples in the library into a training set and a validation set;
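Trimming each clip to an equal-length frame sequence can be sketched, for example, as uniform index sampling. The patent only requires equal length (16 frames in the embodiment below); the uniform-sampling strategy and the handling of short clips here are illustrative assumptions.

```python
def trim_to_length(frames, target_len=16):
    """Return target_len frames sampled uniformly from a decoded clip.
    Clips shorter than target_len repeat their last frame (an assumption)."""
    n = len(frames)
    if n >= target_len:
        # evenly spaced indices across the whole clip
        idx = [i * n // target_len for i in range(target_len)]
    else:
        idx = list(range(n)) + [n - 1] * (target_len - n)
    return [frames[i] for i in idx]

clip = list(range(40))            # stand-in for 40 decoded frames
seq = trim_to_length(clip)
print(len(seq), seq[0], seq[-1])  # 16 0 37
```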
(2) Construct a deep 3D residual network for neonatal pain expression recognition, comprising, connected in sequence: an input layer, a first convolutional layer, a first pooling layer, a 3D residual sub-network, a 2D residual sub-network, a fully connected layer, and a Softmax classification layer;
The input layer receives the video sequence and normalizes every frame image in it;
The first convolutional layer convolves the normalized video sequence output by the input layer with several 3D convolution kernels and outputs several feature-map sequences;
The first pooling layer applies max pooling in the spatial and temporal domains to the output of the first convolutional layer using a 3D pooling kernel, and outputs several feature-map sequences;
The 3D residual sub-network comprises three kinds of 3D residual units with different structures, connected in several alternating cycles, together with pooling layers interspersed along the connection path of the 3D residual units; all three kinds of 3D residual units realize the spatio-temporal 3D convolution operation with a combination of 2D and 1D convolutions, the combinations being, respectively, a serial mode without a shortcut branch, a parallel mode, and a serial mode with a shortcut branch;
The 2D residual sub-network comprises at least three identically structured 2D residual units connected in sequence and one pooling layer;
The fully connected layer fully connects the output of the 2D residual sub-network to its n output neurons and outputs an n-dimensional feature vector; and the Softmax classification layer fully connects the feature vector output by the fully connected layer to n output nodes corresponding to the expression classes and outputs an n-dimensional vector, in which the value of each dimension represents the probability that the input sample belongs to that class, n being the number of expression classes;
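The Softmax layer turns the n raw class scores into a probability distribution, and the predicted class is the dimension with the highest probability. A minimal sketch for the n = 4 expression classes of the embodiment (the score values are hypothetical):

```python
import math

def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the 4 classes (calm, cry, mild pain, severe pain).
probs = softmax([0.5, 0.1, 2.3, 1.1])
pred = probs.index(max(probs))  # predicted class label
print(pred)                     # 2 -> mild pain has the highest probability
```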
(3) Pre-train the constructed deep 3D residual network on a public large-scale labeled video database to obtain initial weight values; then, starting from these initial weights, train the network by fine-tuning with the training and validation samples from the neonatal expression video library, optimizing the model parameters to obtain a trained network model;
(4) Input the neonatal expression video clip to be tested into the trained network model and perform expression classification to obtain the pain expression recognition result.
As a further optimization of the present invention, the 3D residual sub-network comprises a first sub-network, a second sub-network and a third sub-network; each sub-network includes at least three 3D residual units with different structures and one pooling layer.
As a further optimization of the present invention, the 3D residual units in the 3D residual sub-network are 3D residual unit A, 3D residual unit B and 3D residual unit C. 3D residual unit A comprises a first branch and a second branch: the first branch includes, connected in sequence, convolutional layer A1, 3D convolution module A and convolutional layer A4; the second branch is a shortcut connection branch; the outputs of the two branches are added pixel-wise and then passed through a ReLU nonlinear activation layer.
3D residual unit B comprises a first branch and a second branch: the first branch includes, connected in sequence, convolutional layer B1, 3D convolution module B and convolutional layer B4; the second branch is a shortcut connection branch; the outputs of the two branches are added pixel-wise and then passed through a ReLU nonlinear activation layer.
3D residual unit C comprises a first branch and a second branch: the first branch includes, connected in sequence, convolutional layer C1, 3D convolution module C and convolutional layer C4; the second branch is a shortcut connection branch; the outputs of the two branches are added pixel-wise and then passed through a ReLU nonlinear activation layer.
As a further optimization of the present invention, the 2D residual unit comprises a first branch and a second branch: the first branch includes three convolutional layers connected in sequence (convolutional layer 1, convolutional layer 2 and convolutional layer 3); the second branch is a shortcut connection branch; the outputs of the two branches are added pixel-wise and then passed through a ReLU nonlinear activation layer.
As a further optimization of the present invention, in 3D residual unit A, convolutional layer A1 convolves the input with m1 kernels of size 1 × 1 × 1; 3D convolution module A includes convolutional layer A2 and convolutional layer A3, which realize the spatio-temporal 3D convolution in serial mode: convolutional layer A2 convolves the output of convolutional layer A1 in the spatial domain with m1 kernels of size 1 × k × k, and convolutional layer A3 convolves the output of convolutional layer A2 in the temporal domain with m1 kernels of size d × 1 × 1; convolutional layer A4 convolves the output of 3D convolution module A with m2 kernels of size 1 × 1 × 1; here m1 is chosen from {64, 128, 256}, k and d are chosen from {1, 3}, and m2 is chosen from {256, 512, 1024};
In 3D residual unit B, convolutional layer B1 convolves the input with m1 kernels of size 1 × 1 × 1; 3D convolution module B includes convolutional layer B2 and convolutional layer B3, which realize the spatio-temporal 3D convolution in parallel mode: convolutional layer B2 convolves the output of convolutional layer B1 in the spatial domain with m1 kernels of size 1 × k × k, while convolutional layer B3 convolves the output of convolutional layer B1 in the temporal domain with m1 kernels of size d × 1 × 1; the outputs of convolutional layers B2 and B3 are added pixel-wise and passed through a ReLU nonlinear activation layer, forming the input of convolutional layer B4; convolutional layer B4 convolves the output of 3D convolution module B with m2 kernels of size 1 × 1 × 1;
In 3D residual unit C, convolutional layer C1 convolves the input with m1 kernels of size 1 × 1 × 1; 3D convolution module C includes convolutional layer C2 and convolutional layer C3, which realize the spatio-temporal 3D convolution in serial mode with a shortcut branch: convolutional layer C2 convolves the output of convolutional layer C1 in the spatial domain with m1 kernels of size 1 × k × k, and convolutional layer C3 convolves the output of convolutional layer C2 in the temporal domain with m1 kernels of size d × 1 × 1; the outputs of convolutional layers C2 and C3 are added pixel-wise and passed through a ReLU nonlinear activation layer, forming the input of convolutional layer C4; convolutional layer C4 convolves the output of 3D convolution module C with m2 kernels of size 1 × 1 × 1.
As a further optimization of the present invention, in the 2D residual unit the three sequentially connected convolutional layers 1, 2 and 3 each convolve their input with m kernels of size k × k, where m is chosen from {512, 2048}.
As a further optimization of the present invention, in step (3) the fully connected layer of the deep 3D residual network is fine-tuned from the initial weight values with a learning rate larger than a preset threshold, while every other layer of the deep residual convolutional neural network is fine-tuned from the initial weight values with a learning rate smaller than the preset threshold.
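The fine-tuning scheme above — a larger learning rate for the freshly attached fully connected layer, a smaller one for the pre-trained layers — can be expressed as a per-layer assignment. In this sketch the layer names, base rate, and multipliers are illustrative assumptions, not values from the patent:

```python
def finetune_lrs(layer_names, fc_name="fc", base_lr=1e-3,
                 fc_mult=10.0, body_mult=0.1):
    """Assign a learning rate per layer: large for the fully connected
    layer, small for all pre-trained layers (values are assumptions)."""
    return {name: base_lr * (fc_mult if name == fc_name else body_mult)
            for name in layer_names}

lrs = finetune_lrs(["conv1", "res3d", "res2d", "fc"])
for name, lr in lrs.items():
    print(name, lr)  # "fc" gets a rate 100x larger than the pre-trained layers
```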
Another aspect of the present invention provides a neonatal pain expression recognition system based on a deep 3D residual network, comprising:
a sample processing module for collecting the required neonatal expression video clip samples, trimming each video clip into a frame sequence of equal length, building a neonatal expression video library containing pain expression class labels, and dividing the samples in the library into a training set and a validation set;
a network construction module for building the deep 3D residual network applied to neonatal pain expression recognition, comprising, connected in sequence: an input layer, a first convolutional layer, a first pooling layer, a 3D residual sub-network, a 2D residual sub-network, a fully connected layer and a Softmax classification layer;
the input layer receiving the video sequence and normalizing every frame image in it;
the first convolutional layer convolving the normalized video sequence output by the input layer with several 3D convolution kernels and outputting several feature-map sequences;
the first pooling layer applying max pooling in the spatial and temporal domains to the output of the first convolutional layer using a 3D pooling kernel and outputting several feature-map sequences;
the 3D residual sub-network comprising three kinds of 3D residual units with different structures, connected in several alternating cycles, together with pooling layers interspersed along the connection path of the 3D residual units, all three kinds of 3D residual units realizing the spatio-temporal 3D convolution with combinations of 2D and 1D convolutions, the combinations being, respectively, a serial mode without a shortcut branch, a parallel mode, and a serial mode with a shortcut branch;
the 2D residual sub-network comprising at least three identically structured 2D residual units connected in sequence and one pooling layer;
the fully connected layer fully connecting the output of the 2D residual sub-network to n output neurons and outputting an n-dimensional feature vector; and the Softmax classification layer fully connecting the feature vector output by the fully connected layer to n output nodes corresponding to the expression classes and outputting an n-dimensional vector, in which the value of each dimension represents the probability that the input sample belongs to that class, n being the number of expression classes;
a model training module for pre-training the constructed deep 3D residual network on a public large-scale labeled video database to obtain initial weight values, and, starting from these initial weights, training the network by fine-tuning with the training and validation samples from the neonatal expression video library, optimizing the model parameters to obtain a trained network model;
and a test module for inputting the neonatal expression video clip to be tested into the trained network model and performing expression classification to obtain the pain expression recognition result.
Advantageous effects: Compared with the prior art, the present invention has the following advantages:
(1) The 3D convolution of size d × k × k is realized by combining a 2D convolution of size 1 × k × k with a 1D convolution of size d × 1 × 1. Compared with a 2D convolutional neural network of the same depth, the deep 3D residual network adds only a limited number of 1D convolutions and does not cause excessive growth in parameter count or running time.
(2) When the constructed deep 3D residual network is pre-trained on a public large-scale labeled video database, the spatial 2D convolution kernels can be pre-trained on a public labeled large-scale image database, and only the temporal 1D convolution kernels need random initialization; this reduces the training difficulty of the network and speeds up its training.
(3) The deep 3D residual network is an extension of the deep residual convolutional neural network: its basic residual unit is formed by adding a shortcut connection branch, which effectively alleviates the vanishing-gradient problem in back-propagation as the network deepens, and thus solves the problems that deep networks are hard to train and suffer performance degradation.
(4) The temporal and spatial features of a video clip are extracted with the deep 3D residual network, extending feature extraction from still images to video sequences. Dynamic features reflecting temporal information can be extracted autonomously, and the extracted expression features better characterize changes in facial expression; compared with traditional hand-crafted features they have stronger representation and generalization power, ultimately improving classification accuracy.
Description of the drawings
Fig. 1 is a schematic flow chart of the neonatal pain expression recognition method based on a deep 3D residual network provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the deep 3D residual network provided by an embodiment of the present invention;
Fig. 3 shows the structures of the 4 sub-networks in the deep 3D residual network provided by an embodiment of the present invention, in which (a)-(c) are the 3 sub-networks of the 3D residual sub-network and (d) is the 2D residual sub-network;
Fig. 4 shows the structures of the 3 kinds of 3D residual units and the 2D residual unit provided by an embodiment of the present invention, in which (a)-(c) are 3D residual units A-C and (d) is the 2D residual unit.
Detailed description of the embodiments
Specific embodiments of the present invention are further described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, a neonatal pain expression recognition method based on a deep 3D residual network provided by an embodiment of the present invention mainly includes the following steps:
Step 1: Collect the required neonatal expression video clip samples, trim each video clip into a frame sequence of equal length, build a neonatal expression video library containing pain expression class labels, and divide the samples in the library into a training set and a validation set.
Pain expression videos of newborns during routine painful procedures such as intravenous injection and blood sampling are collected, together with expression video clips in non-pain states: a calm state, and crying caused by reasons such as hunger. Professionals assess the collected videos on a 1-10 pain scoring standard using the internationally recognized neonatal pain assessment tool, the Neonatal Facial Coding System (NFCS), combined with other medical physical signs; expressions with scores of 1-5 are classified as mild pain expressions, and those with scores of 6-10 as severe pain expressions. Finally, video clips of the 4 typical expression classes with high scoring consistency (calm, crying, mild pain, severe pain) are selected, each video clip is trimmed to a sequence of 16 frames, a neonatal expression video library containing pain expression class labels is built, and the samples in the library are divided into a training set and a validation set at a ratio of 7:3. In this embodiment, the calm expression is marked with label 0, the crying-without-pain expression with label 1, the mild pain expression with label 2, and the severe pain expression with label 3.
Step 2: Construct the deep 3D residual network applied to neonatal pain expression recognition. The constructed network comprises, connected in sequence: an input layer, a first convolutional layer, a first pooling layer, a 3D residual sub-network, a 2D residual sub-network, a fully connected layer and a Softmax classification layer. The core of the network is the 3D residual sub-network, which comprises three kinds of 3D residual units with different structures, connected in several alternating cycles, and pooling layers interspersed along their connection path. These 3D residual units realize the spatio-temporal 3D convolution with combinations of 2D and 1D convolutions to reduce computational complexity; the combinations include a serial mode without a shortcut branch, a parallel mode, and a serial mode with a shortcut branch.
The constructed deep 3D residual network structure is described in detail below using the concrete scenario of this embodiment. It should be understood that those skilled in the art can make appropriate adjustments on the basis of this specific implementation to suit particular applications.
As shown in Fig. 2, the deep 3D residual network in this embodiment mainly includes: an input layer, a first convolutional layer, a first pooling layer, a 3D residual sub-network, a 2D residual sub-network, a fully connected layer and a Softmax classification layer, where the 3D residual sub-network is divided into a first sub-network, a second sub-network and a third sub-network.
The input layer normalizes each frame image in the input 16-frame video sequence to 160 × 160 pixels.
The first convolutional layer convolves the normalized video sequence output by the input layer with 64 convolution kernels of size 1 × 7 × 7, followed by batch normalization (Batch Normalization, BN) and the nonlinear activation function ReLU, and outputs 64 feature-map sequences of size 16 × 80 × 80.
The first pooling layer applies max pooling in the spatial and temporal domains to the output of the first convolutional layer with a 2 × 3 × 3 pooling kernel, and outputs 64 feature-map sequences of size 8 × 39 × 39.
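The stated feature-map sizes can be reproduced with the standard output-size formula floor((n + 2p − k)/s) + 1. The strides and paddings below are assumptions inferred from the sizes given in the embodiment (spatial stride 2 with padding 3 in the first convolution; stride 2 on every axis, no padding, in the pooling):

```python
def out_dim(n, k, s, p=0):
    """Output size along one axis of a convolution or pooling:
    n = input size, k = kernel, s = stride, p = padding."""
    return (n + 2 * p - k) // s + 1

# Input clip: 16 frames of 160 x 160 pixels.
t, h, w = 16, 160, 160

# First convolutional layer: 1 x 7 x 7 kernels, assumed spatial stride 2, padding 3.
t1, h1, w1 = out_dim(t, 1, 1), out_dim(h, 7, 2, 3), out_dim(w, 7, 2, 3)
print(t1, h1, w1)  # 16 80 80, matching the stated 16 x 80 x 80 maps

# First pooling layer: 2 x 3 x 3 kernel, assumed stride 2 everywhere, no padding.
t2, h2, w2 = out_dim(t1, 2, 2), out_dim(h1, 3, 2), out_dim(w1, 3, 2)
print(t2, h2, w2)  # 8 39 39, matching the stated 8 x 39 x 39 maps
```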
As shown in Fig. 3(a), the first sub-network includes, connected in sequence, 3D residual unit A, 3D residual unit B, 3D residual unit C and a second pooling layer.
As shown in Fig. 4(a), 3D residual unit A comprises a first branch and a second branch. The first branch comprises, connected in sequence, convolutional layer A1, 3D convolution module A and convolutional layer A4; the second branch is a shortcut connection. The outputs of the two branches are added pixel-by-pixel and then mapped by the nonlinear activation function ReLU, yielding 256 feature-map sequences of size 8 × 39 × 39. Convolutional layer A1 applies 64 convolution kernels of size 1 × 1 × 1 to the output of the first pooling layer, followed by batch normalization (BN) and a ReLU mapping, and outputs 64 feature-map sequences of size 8 × 39 × 39. 3D convolution module A comprises convolutional layers A2 and A3, which are connected serially to realize the 3D convolution over the spatial and temporal domains: convolutional layer A2 applies 64 kernels of size 1 × 3 × 3 to the output of A1 (a spatial convolution), followed by BN and a ReLU mapping, and outputs 64 feature-map sequences of size 8 × 39 × 39; convolutional layer A3 applies 64 kernels of size 3 × 1 × 1 to the output of A2 (a temporal convolution), followed by BN, and outputs 64 feature-map sequences of size 8 × 39 × 39. Convolutional layer A4 applies 256 kernels of size 1 × 1 × 1 to the output of 3D convolution module A, followed by BN, and outputs 256 feature-map sequences of size 8 × 39 × 39.
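The serial design of 3D convolution module A factorizes a full 3 × 3 × 3 spatio-temporal convolution into a 1 × 3 × 3 spatial convolution followed by a 3 × 1 × 1 temporal convolution. A minimal sketch of the weight savings this factorization yields (plain Python arithmetic, bias terms ignored; illustrative only, not taken from the patent):

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    """Weight count of a 3D convolution layer, ignoring bias terms."""
    return c_in * c_out * kt * kh * kw

# Full 3D convolution: 64 -> 64 channels with a 3x3x3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)

# Serial (2D + 1D) factorization used in module A:
# layer A2 (spatial 1x3x3) followed by layer A3 (temporal 3x1x1).
factored = conv3d_params(64, 64, 1, 3, 3) + conv3d_params(64, 64, 3, 1, 1)

print(full, factored)   # 110592 49152
print(factored / full)  # under half the weights of the full 3D kernel
```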
As shown in Fig. 4(b), 3D residual unit B comprises a first branch and a second branch. The first branch comprises, connected in sequence, convolutional layer B1, 3D convolution module B and convolutional layer B4; the second branch is a shortcut connection. The outputs of the two branches are added pixel-by-pixel and then mapped by the nonlinear activation function ReLU, yielding 256 feature-map sequences of size 8 × 39 × 39. Convolutional layer B1 applies 64 convolution kernels of size 1 × 1 × 1 to the output of 3D residual unit A, followed by batch normalization (BN) and a ReLU mapping, and outputs 64 feature-map sequences of size 8 × 39 × 39. 3D convolution module B comprises convolutional layers B2 and B3, which are connected in parallel to realize the 3D convolution over the spatial and temporal domains: convolutional layer B2 applies 64 kernels of size 1 × 3 × 3 to the output of B1 (a spatial convolution), followed by BN and a ReLU mapping, and outputs 64 feature-map sequences of size 8 × 39 × 39; convolutional layer B3 applies 64 kernels of size 3 × 1 × 1 to the output of B1 (a temporal convolution), followed by BN, and outputs 64 feature-map sequences of size 8 × 39 × 39; the outputs of B2 and B3 are added pixel-by-pixel and then mapped by ReLU, yielding 64 feature-map sequences of size 8 × 39 × 39. Convolutional layer B4 applies 256 kernels of size 1 × 1 × 1 to the output of 3D convolution module B, followed by BN, and outputs 256 feature-map sequences of size 8 × 39 × 39.
As shown in Fig. 4(c), 3D residual unit C comprises a first branch and a second branch. The first branch comprises, connected in sequence, convolutional layer C1, 3D convolution module C and convolutional layer C4; the second branch is a shortcut connection. The outputs of the two branches are added pixel-by-pixel and then mapped by the nonlinear activation function ReLU, yielding 256 feature-map sequences of size 8 × 39 × 39. Convolutional layer C1 applies 64 convolution kernels of size 1 × 1 × 1 to the output of 3D residual unit B, followed by batch normalization (BN) and a ReLU mapping, and outputs 64 feature-map sequences of size 8 × 39 × 39. 3D convolution module C comprises convolutional layers C2 and C3, which are connected serially with a shortcut branch to realize the 3D convolution over the spatial and temporal domains: convolutional layer C2 applies 64 kernels of size 1 × 3 × 3 to the output of C1 (a spatial convolution), followed by BN and a ReLU mapping, and outputs 64 feature-map sequences of size 8 × 39 × 39; convolutional layer C3 applies 64 kernels of size 3 × 1 × 1 to the output of C2 (a temporal convolution), followed by BN, and outputs 64 feature-map sequences of size 8 × 39 × 39; the outputs of C2 and C3 are added pixel-by-pixel and then mapped by ReLU, yielding 64 feature-map sequences of size 8 × 39 × 39. Convolutional layer C4 applies 256 kernels of size 1 × 1 × 1 to the output of 3D convolution module C, followed by BN, and outputs 256 feature-map sequences of size 8 × 39 × 39.
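The three units described above differ only in how the spatial operation S (the 1 × 3 × 3 convolution) and the temporal operation T (the 3 × 1 × 1 convolution) are combined inside the module: serially (unit A), in parallel (unit B), or serially with an inner shortcut (unit C). A toy sketch with scalar stand-ins for the feature maps (the functions S and T below are made up for illustration; they are not the actual convolutions):

```python
# Toy stand-ins for the spatial (1x3x3) and temporal (3x1x1) convolutions.
S = lambda x: 2 * x   # "spatial" operation
T = lambda x: x + 1   # "temporal" operation

def module_a(x):  # serial:               T(S(x))
    return T(S(x))

def module_b(x):  # parallel:             S(x) + T(x)
    return S(x) + T(x)

def module_c(x):  # serial with shortcut: S(x) + T(S(x))
    return S(x) + T(S(x))

x = 3
print(module_a(x), module_b(x), module_c(x))  # 7 10 13
```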
The second pooling layer applies a 2 × 1 × 1 pooling kernel to the output of 3D residual unit C for max pooling over the temporal and spatial domains, and outputs 256 feature-map sequences of size 4 × 39 × 39.
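The frame counts and spatial sizes quoted throughout this section follow the standard convolution/pooling output-size formula. A small helper, sketched in plain Python under the assumption that the spatial convolutions use stride 1 with padding 1 and the pooling layer uses a stride equal to its kernel size (strides and padding are not stated explicitly in the text):

```python
def out_len(n, k, stride, pad):
    """Output length along one axis for input length n, kernel k,
    the given stride, and symmetric padding pad."""
    return (n + 2 * pad - k) // stride + 1

# A 1x3x3 spatial convolution with stride 1, padding 1 preserves 39x39.
assert out_len(39, 3, 1, 1) == 39

# The 2x1x1 max pooling after residual unit C (temporal stride 2 assumed)
# halves the 8-frame sequence to 4 frames and leaves 39x39 untouched.
t = out_len(8, 2, 2, 0)
h = out_len(39, 1, 1, 0)
w = out_len(39, 1, 1, 0)
print(t, h, w)  # 4 39 39
```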
As shown in Fig. 3(b), the second sub-network comprises, connected in sequence, 3D residual unit A, 3D residual unit B, 3D residual unit C, 3D residual unit A, 3D residual unit B, 3D residual unit C, 3D residual unit A, 3D residual unit B and the third pooling layer, and outputs 512 feature-map sequences of size 2 × 20 × 20.
As shown in Fig. 3(c), the third sub-network is composed of 36 3D residual units and 1 pooling layer, connected in the order: 3D residual unit C → 3D residual unit A → 3D residual unit B → … → 3D residual unit A → 3D residual unit B → the fourth pooling layer, and outputs 1024 feature-map sequences of size 1 × 10 × 10.
As shown in Fig. 3(d), the 2D residual sub-network comprises 3 sequentially connected 2D residual units and the fifth pooling layer, and outputs 2048 feature maps of size 1 × 1.
As shown in Fig. 4(d), the 2D residual unit comprises a first branch and a second branch. The first branch comprises 3 sequentially connected convolutional layers: convolutional layer 1, convolutional layer 2 and convolutional layer 3; the second branch is a shortcut connection. The outputs of the two branches are added pixel-by-pixel and then mapped by the nonlinear activation function ReLU, yielding 2048 feature maps of size 5 × 5. Convolutional layer 1 performs a convolution using 512 kernels of size 1 × 1, followed by batch normalization (BN) and a ReLU mapping, and outputs 512 feature maps of size 5 × 5; convolutional layer 2 applies 512 kernels of size 3 × 3 to the output of convolutional layer 1, followed by BN and a ReLU mapping, and outputs 512 feature maps of size 5 × 5; convolutional layer 3 applies 2048 kernels of size 1 × 1 to the output of convolutional layer 2, followed by BN, and outputs 2048 feature maps of size 5 × 5.
The fifth pooling layer performs average pooling on the output of the third 2D residual unit, adjusting the connection weights with the dropout method, and outputs 2048 feature maps of size 1 × 1.
The fully connected layer fully connects the output of the 2D residual sub-network to its 4 output neurons and outputs a 4-dimensional feature vector.
The Softmax classification layer fully connects the feature vector output by the fully connected layer to 4 output nodes corresponding to the expression classes and outputs a 4-dimensional vector, in which the number in each dimension represents the probability that the input sample belongs to that class.
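The probability interpretation of the Softmax layer's 4-dimensional output can be sketched in plain Python (the raw class scores below are made-up values, not taken from the patent):

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 4-dimensional output of the fully connected layer,
# one score per expression class.
scores = [0.5, 1.2, 3.1, 0.2]
probs = softmax(scores)

# The dimension with the largest probability gives the predicted class.
predicted = probs.index(max(probs))
print(predicted)  # 2
```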
In step 3, the constructed deep 3D residual network is pre-trained on the publicly available, class-labeled Kinetics video database to obtain initial weight parameter values; based on these initial weight parameter values, the constructed deep 3D residual network is trained by fine-tuning, using the training-set and validation-set samples in the neonatal expression video library, and the network model parameters are optimized to obtain a trained network model.
In this step, the deep 3D residual network is trained on the training set using a transfer-learning method and tested on the validation set, yielding a trained deep 3D residual network. The idea of transfer learning is to first pre-train the constructed deep 3D residual network on a data set with sufficient training samples and then transfer the pre-trained model parameters to the target model, so that the target model starts from good initial weight parameter values and already has the ability to extract features from video images. The deep 3D residual network is then fine-tuned (Fine-Tuning) on the established neonatal expression video library; that is, the pre-trained deep 3D residual network is further trained on the training set.
In this example, the deep 3D residual network is first pre-trained on the Kinetics video database (videos from a number of label classes equal to the number of neonatal pain expression classes may be selected for training). Through pre-training, the constructed deep 3D residual network acquires the ability to extract features from video images, and the network obtains good initial weight parameter values. In addition, to further reduce the training difficulty of the network and accelerate training, when the constructed deep 3D residual network is pre-trained on the publicly available large-scale labeled video database, the spatial-domain 2D convolution kernels in the 3D residual network may first be pre-trained on a publicly available large-scale labeled image database (such as ImageNet), so that only the temporal-domain 1D convolution kernels need random initialization.
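Because the spatial kernels of the 3D residual units have shape 1 × k × k, a k × k filter pre-trained on an image database maps into them directly as a single time slice, leaving only the d × 1 × 1 temporal kernels to be randomly initialized. A toy sketch of that weight transfer, with nested Python lists standing in for tensors (not the patent's actual implementation):

```python
import random

def lift_2d_kernel(kernel_2d):
    """Wrap a k x k spatial filter into a 1 x k x k 3D kernel
    by treating it as a single time slice."""
    return [kernel_2d]

# A hypothetical 3x3 filter from a 2D network pre-trained on image data.
pretrained_2d = [[0.1, 0.0, -0.1],
                 [0.2, 0.0, -0.2],
                 [0.1, 0.0, -0.1]]

spatial_3d = lift_2d_kernel(pretrained_2d)  # 1x3x3: reused as-is

# The temporal 3x1x1 kernel has no 2D counterpart: random-initialize it.
temporal_3d = [[[random.gauss(0.0, 0.01)]] for _ in range(3)]

print(len(spatial_3d), len(spatial_3d[0]), len(spatial_3d[0][0]))      # 1 3 3
print(len(temporal_3d), len(temporal_3d[0]), len(temporal_3d[0][0]))   # 3 1 1
```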
After the initial weight parameter values are obtained, the constructed deep 3D residual network is fine-tuned on the neonatal expression video library. The update rule for the weight parameters during fine-tuning is: the fully connected layer of the deep 3D residual network is trained, on the basis of the initial weight parameter values, with a learning rate larger than a preset threshold, while every other layer of the deep 3D residual network is trained, on the basis of the initial weight parameter values, with a learning rate smaller than the preset threshold; the preset threshold is determined according to the actual training conditions. For example, apart from the last fully connected layer, the other layers of the deep 3D residual network are trained with a learning rate of 0.001 on the basis of their original parameters; that is, the weight parameters of every layer except the fully connected layer are updated only by small amounts from the initial weight parameter values obtained by pre-training, while the last fully connected layer is trained with a learning rate of 0.01. Specifically, training uses the Softmax loss function, which is optimized by gradient descent to update the network parameters.
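The two-rate update rule described above amounts to plain gradient descent with a per-layer learning rate. A minimal single-step sketch (plain Python; the weights and gradients are illustrative values, not taken from the patent):

```python
# One gradient-descent step, w <- w - lr * grad, with the learning rate
# chosen per layer: 0.01 for the final fully connected layer, 0.001 elsewhere.
def sgd_step(weights, grads, lr):
    return [w - lr * g for w, g in zip(weights, grads)]

body_w, body_g = [0.50, -0.30], [1.0, -2.0]  # a pre-trained "body" layer
fc_w, fc_g = [0.10, 0.20], [1.0, -2.0]       # the final fully connected layer

body_w = sgd_step(body_w, body_g, lr=0.001)  # small update: stays near init
fc_w = sgd_step(fc_w, fc_g, lr=0.01)         # 10x larger update

print([round(w, 3) for w in body_w])  # [0.499, -0.298]
print([round(w, 3) for w in fc_w])    # [0.09, 0.22]
```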
In step 4, the neonatal expression video clip to be tested is input into the trained network model for expression classification and recognition, and the pain expression recognition result is obtained.
Based on the same inventive concept, another embodiment of the present invention discloses a neonatal pain expression recognition system based on a deep 3D residual network, comprising: a sample processing module for collecting the required neonatal expression video clip samples, trimming each video clip into a frame sequence of equal length, establishing a neonatal expression video library containing pain expression class labels, and dividing the samples in the neonatal expression video library into a training set and a validation set; a network construction module for constructing a deep 3D residual network applied to neonatal pain expression recognition, comprising, connected in sequence: an input layer, a first convolutional layer, a first pooling layer, a 3D residual sub-network, a 2D residual sub-network, a fully connected layer and a Softmax classification layer; a model training module for pre-training the constructed deep 3D residual network using a publicly available large-scale video database with class labels to obtain initial weight parameter values, and, based on the initial weight parameter values, training the constructed deep 3D residual network by fine-tuning using the training-set and validation-set samples in the neonatal expression video library, optimizing the network model parameters and obtaining a trained network model; and a test module for inputting the neonatal expression video clip to be tested into the trained network model, performing expression classification and recognition, and obtaining a pain expression recognition result. For the specific implementation details of this embodiment, please refer to the method embodiment above; they are not repeated here.
Those skilled in the art will understand that the modules in the embodiments can be adaptively changed and arranged in one or more systems different from the embodiments. The modules, units or components in the embodiments can be combined into one module, unit or component, and can furthermore be divided into a plurality of sub-modules, sub-units or sub-components.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any transformation or replacement that a person familiar with the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope defined by the claims.
Claims (8)
1. A neonatal pain expression recognition method based on a deep 3D residual network, characterized by comprising the following steps:
(1) collecting the required neonatal expression video clip samples, trimming each video clip into a frame sequence of equal length, establishing a neonatal expression video library containing pain expression class labels, and dividing the samples in the neonatal expression video library into a training set and a validation set;
(2) constructing a deep 3D residual network applied to neonatal pain expression recognition, comprising, connected in sequence: an input layer, a first convolutional layer, a first pooling layer, a 3D residual sub-network, a 2D residual sub-network, a fully connected layer and a Softmax classification layer;
the input layer is used to input a video sequence and normalize every frame image in the video sequence;
the first convolutional layer performs a convolution operation on the normalized video sequence output by the input layer using several 3D convolution kernels, and outputs several feature-map sequences;
the first pooling layer performs a max pooling operation over the spatial and temporal domains on the output of the first convolutional layer using a 3D pooling kernel, and outputs several feature-map sequences;
the 3D residual sub-network comprises 3 kinds of 3D residual units of different structures connected in several alternating cycles, and pooling layers interposed in the connection paths of the 3D residual units; the 3 kinds of 3D residual units all realize the 3D convolution over the spatial and temporal domains by combining a 2D convolution and a 1D convolution, the combinations being, respectively, a serial mode without a shortcut branch, a parallel mode, and a serial mode with a shortcut branch;
the 2D residual sub-network comprises at least three sequentially connected 2D residual units of identical structure and 1 pooling layer;
the fully connected layer fully connects the output of the 2D residual sub-network to its n output neurons and outputs an n-dimensional feature vector;
and the Softmax classification layer fully connects the feature vector output by the fully connected layer to n output nodes corresponding to the expression classes and outputs an n-dimensional vector, in which the number in each dimension represents the probability that the input sample belongs to that class, where n is the number of expression classes;
(3) pre-training the constructed deep 3D residual network using a publicly available large-scale video database with class labels to obtain initial weight parameter values; based on the initial weight parameter values, training the constructed deep 3D residual network by fine-tuning using the training-set and validation-set samples in the neonatal expression video library, and optimizing the network model parameters to obtain a trained network model;
(4) inputting the neonatal expression video clip to be tested into the trained network model, performing expression classification and recognition, and obtaining a pain expression recognition result.
2. The neonatal pain expression recognition method based on a deep 3D residual network according to claim 1, characterized in that the 3D residual sub-network comprises a first sub-network, a second sub-network and a third sub-network; each sub-network comprises at least 3 3D residual units of different structures and 1 pooling layer.
3. The neonatal pain expression recognition method based on a deep 3D residual network according to claim 1, characterized in that the 3D residual units in the 3D residual sub-network are, respectively, 3D residual unit A, 3D residual unit B and 3D residual unit C;
3D residual unit A comprises a first branch and a second branch, the first branch comprising, connected in sequence, convolutional layer A1, 3D convolution module A and convolutional layer A4, and the second branch being a shortcut connection branch; after the outputs of the first branch and the second branch are added pixel-by-pixel, the result is output through a ReLU nonlinear activation function layer;
3D residual unit B comprises a first branch and a second branch, the first branch comprising, connected in sequence, convolutional layer B1, 3D convolution module B and convolutional layer B4, and the second branch being a shortcut connection branch; after the outputs of the first branch and the second branch are added pixel-by-pixel, the result is output through a ReLU nonlinear activation function layer;
3D residual unit C comprises a first branch and a second branch, the first branch comprising, connected in sequence, convolutional layer C1, 3D convolution module C and convolutional layer C4, and the second branch being a shortcut connection branch; after the outputs of the first branch and the second branch are added pixel-by-pixel, the result is output through a ReLU nonlinear activation function layer.
4. The neonatal pain expression recognition method based on a deep 3D residual network according to claim 1, characterized in that the 2D residual unit comprises a first branch and a second branch, the first branch comprising 3 sequentially connected convolutional layers 1, 2 and 3, and the second branch being a shortcut connection branch; after the outputs of the first branch and the second branch are added pixel-by-pixel, the result is output through a ReLU nonlinear activation function layer.
5. The neonatal pain expression recognition method based on a deep 3D residual network according to claim 3, characterized in that in 3D residual unit A, convolutional layer A1 performs a convolution operation on the input using m1 convolution kernels of size 1 × 1 × 1; 3D convolution module A comprises convolutional layer A2 and convolutional layer A3, connected serially to realize the 3D convolution over the spatial and temporal domains, wherein convolutional layer A2 performs a spatial convolution on the output of convolutional layer A1 using m1 convolution kernels of size 1 × k × k, and convolutional layer A3 performs a temporal convolution on the output of convolutional layer A2 using m1 convolution kernels of size d × 1 × 1; convolutional layer A4 performs a convolution operation on the output of 3D convolution module A using m2 convolution kernels of size 1 × 1 × 1; wherein m1 is selected from the values 64, 128 and 256, k and d are selected from the values 1 and 3, and m2 is selected from the values 256, 512 and 1024;
in 3D residual unit B, convolutional layer B1 performs a convolution operation on the input using m1 convolution kernels of size 1 × 1 × 1; 3D convolution module B comprises convolutional layer B2 and convolutional layer B3, connected in parallel to realize the 3D convolution over the spatial and temporal domains, wherein convolutional layer B2 performs a spatial convolution on the output of convolutional layer B1 using m1 convolution kernels of size 1 × k × k, and convolutional layer B3 performs a temporal convolution on the output of convolutional layer B1 using m1 convolution kernels of size d × 1 × 1; after the outputs of convolutional layer B2 and convolutional layer B3 are added pixel-by-pixel, the result is output through a ReLU nonlinear activation function layer and serves as the input of convolutional layer B4; convolutional layer B4 performs a convolution operation on the output of 3D convolution module B using m2 convolution kernels of size 1 × 1 × 1;
in 3D residual unit C, convolutional layer C1 performs a convolution operation on the input using m1 convolution kernels of size 1 × 1 × 1; 3D convolution module C comprises convolutional layer C2 and convolutional layer C3, connected serially with a shortcut branch to realize the 3D convolution over the spatial and temporal domains, wherein convolutional layer C2 performs a spatial convolution on the output of convolutional layer C1 using m1 convolution kernels of size 1 × k × k, and convolutional layer C3 performs a temporal convolution on the output of convolutional layer C2 using m1 convolution kernels of size d × 1 × 1; after the outputs of convolutional layer C2 and convolutional layer C3 are added pixel-by-pixel, the result is output through a ReLU nonlinear activation function layer and serves as the input of convolutional layer C4; convolutional layer C4 performs a convolution operation on the output of 3D convolution module C using m2 convolution kernels of size 1 × 1 × 1.
6. The neonatal pain expression recognition method based on a deep 3D residual network according to claim 4, characterized in that in the 2D residual unit, the 3 sequentially connected convolutional layers 1, 2 and 3 each perform a convolution operation on their input using m convolution kernels of size k × k, wherein m is selected from the values 512 and 2048.
7. The neonatal pain expression recognition method based on a deep 3D residual network according to claim 1, characterized in that in step (3), the fully connected layer of the deep 3D residual network is fine-tuned, on the basis of the initial weight parameter values, using a learning rate larger than a preset threshold, while, apart from the fully connected layer, each other layer of the deep 3D residual network is fine-tuned, on the basis of the initial weight parameter values, using a learning rate smaller than the preset threshold.
8. A neonatal pain expression recognition system based on a deep 3D residual network, characterized by comprising:
a sample processing module for collecting the required neonatal expression video clip samples, trimming each video clip into a frame sequence of equal length, establishing a neonatal expression video library containing pain expression class labels, and dividing the samples in the neonatal expression video library into a training set and a validation set;
a network construction module for constructing a deep 3D residual network applied to neonatal pain expression recognition, comprising, connected in sequence: an input layer, a first convolutional layer, a first pooling layer, a 3D residual sub-network, a 2D residual sub-network, a fully connected layer and a Softmax classification layer;
the input layer is used to input a video sequence and normalize every frame image in the video sequence;
the first convolutional layer performs a convolution operation on the normalized video sequence output by the input layer using several 3D convolution kernels, and outputs several feature-map sequences;
the first pooling layer performs a max pooling operation over the spatial and temporal domains on the output of the first convolutional layer using a 3D pooling kernel, and outputs several feature-map sequences;
the 3D residual sub-network comprises 3 kinds of 3D residual units of different structures connected in several alternating cycles, and pooling layers interposed in the connection paths of the 3D residual units; the 3 kinds of 3D residual units all realize the 3D convolution over the spatial and temporal domains by combining a 2D convolution and a 1D convolution, the combinations being, respectively, a serial mode without a shortcut branch, a parallel mode, and a serial mode with a shortcut branch;
the 2D residual sub-network comprises at least three sequentially connected 2D residual units of identical structure and 1 pooling layer;
the fully connected layer fully connects the output of the 2D residual sub-network to n output neurons and outputs an n-dimensional feature vector;
and the Softmax classification layer fully connects the feature vector output by the fully connected layer to n output nodes corresponding to the expression classes and outputs an n-dimensional vector, in which the number in each dimension represents the probability that the input sample belongs to that class, where n is the number of expression classes;
a model training module for pre-training the constructed deep 3D residual network using a publicly available large-scale video database with class labels to obtain initial weight parameter values, and, based on the initial weight parameter values, training the constructed deep 3D residual network by fine-tuning using the training-set and validation-set samples in the neonatal expression video library, optimizing the network model parameters and obtaining a trained network model;
and a test module for inputting the neonatal expression video clip to be tested into the trained network model, performing expression classification and recognition, and obtaining a pain expression recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810346075.3A CN108596069A (en) | 2018-04-18 | 2018-04-18 | Neonatal pain expression recognition method and system based on depth 3D residual error networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108596069A true CN108596069A (en) | 2018-09-28 |
Family
ID=63613362
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810346075.3A Pending CN108596069A (en) | 2018-04-18 | 2018-04-18 | Neonatal pain expression recognition method and system based on depth 3D residual error networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108596069A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110348420A (en) * | 2019-07-18 | 2019-10-18 | 腾讯科技(深圳)有限公司 | Sign Language Recognition Method, device, computer readable storage medium and computer equipment |
CN110674488A (en) * | 2019-09-06 | 2020-01-10 | 深圳壹账通智能科技有限公司 | Verification code identification method and system based on neural network and computer equipment |
WO2020098257A1 (en) * | 2018-11-14 | 2020-05-22 | 平安科技(深圳)有限公司 | Image classification method and device and computer readable storage medium |
CN111222457A (en) * | 2020-01-06 | 2020-06-02 | 电子科技大学 | Detection method for identifying video authenticity based on depth separable convolution |
CN111310516A (en) * | 2018-12-11 | 2020-06-19 | 杭州海康威视数字技术股份有限公司 | Behavior identification method and device |
CN111428771A (en) * | 2019-11-08 | 2020-07-17 | 腾讯科技(深圳)有限公司 | Video scene classification method and device and computer-readable storage medium |
CN111462049A (en) * | 2020-03-09 | 2020-07-28 | 西南交通大学 | Automatic lesion area form labeling method in mammary gland ultrasonic radiography video |
WO2020248841A1 (en) * | 2019-06-13 | 2020-12-17 | 平安科技(深圳)有限公司 | Au detection method and apparatus for image, and electronic device and storage medium |
CN112800894A (en) * | 2021-01-18 | 2021-05-14 | 南京邮电大学 | Dynamic expression recognition method and system based on attention mechanism between space and time streams |
CN113180594A (en) * | 2021-03-09 | 2021-07-30 | 山西三友和智慧信息技术股份有限公司 | Method for evaluating postoperative pain of newborn through multidimensional space-time deep learning |
CN113313056A (en) * | 2021-06-16 | 2021-08-27 | 中国科学技术大学 | Compact 3D convolution-based lip language identification method, system, device and storage medium |
CN116796818A (en) * | 2022-03-15 | 2023-09-22 | 生物岛实验室 | Model training method, device, equipment, storage medium and program product |
CN110674488B (en) * | 2019-09-06 | 2024-04-26 | 深圳壹账通智能科技有限公司 | Verification code identification method, system and computer equipment based on neural network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2002366825A1 (en) * | 2001-12-20 | 2003-07-09 | Koninklijke Philips Electronics N.V. | Video encoding and decoding method and device |
JP2008310775A (en) * | 2007-06-18 | 2008-12-25 | Canon Inc | Expression recognition device and method and imaging apparatus |
CN106570474A (en) * | 2016-10-27 | 2017-04-19 | 南京邮电大学 | Micro expression recognition method based on 3D convolution neural network |
CN107392109A (en) * | 2017-06-27 | 2017-11-24 | 南京邮电大学 | A kind of neonatal pain expression recognition method based on deep neural network |
Legal event: 2018-04-18 — application CN201810346075.3A filed; status: Pending
Non-Patent Citations (1)
Title |
---|
ZHAOFAN QIU et al.: "Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks", Computer Vision Foundation * |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020098257A1 (en) * | 2018-11-14 | 2020-05-22 | 平安科技(深圳)有限公司 | Image classification method and device and computer readable storage medium |
CN111310516A (en) * | 2018-12-11 | 2020-06-19 | 杭州海康威视数字技术股份有限公司 | Behavior identification method and device |
CN111310516B (en) * | 2018-12-11 | 2023-08-29 | 杭州海康威视数字技术股份有限公司 | Behavior recognition method and device |
WO2020248841A1 (en) * | 2019-06-13 | 2020-12-17 | 平安科技(深圳)有限公司 | Au detection method and apparatus for image, and electronic device and storage medium |
CN110348420A (en) * | 2019-07-18 | 2019-10-18 | 腾讯科技(深圳)有限公司 | Sign Language Recognition Method, device, computer readable storage medium and computer equipment |
CN110674488B (en) * | 2019-09-06 | 2024-04-26 | 深圳壹账通智能科技有限公司 | Verification code identification method, system and computer equipment based on neural network |
CN110674488A (en) * | 2019-09-06 | 2020-01-10 | 深圳壹账通智能科技有限公司 | Verification code identification method and system based on neural network and computer equipment |
CN111428771B (en) * | 2019-11-08 | 2023-04-18 | 腾讯科技(深圳)有限公司 | Video scene classification method and device and computer-readable storage medium |
CN111428771A (en) * | 2019-11-08 | 2020-07-17 | 腾讯科技(深圳)有限公司 | Video scene classification method and device and computer-readable storage medium |
CN111222457A (en) * | 2020-01-06 | 2020-06-02 | 电子科技大学 | Detection method for identifying video authenticity based on depth separable convolution |
CN111222457B (en) * | 2020-01-06 | 2023-06-16 | 电子科技大学 | Detection method for identifying authenticity of video based on depth separable convolution |
CN111462049A (en) * | 2020-03-09 | 2020-07-28 | 西南交通大学 | Automatic lesion area form labeling method in mammary gland ultrasonic radiography video |
CN111462049B (en) * | 2020-03-09 | 2022-05-17 | 西南交通大学 | Automatic lesion area form labeling method in mammary gland ultrasonic radiography video |
CN112800894A (en) * | 2021-01-18 | 2021-05-14 | 南京邮电大学 | Dynamic expression recognition method and system based on attention mechanism between space and time streams |
CN112800894B (en) * | 2021-01-18 | 2022-08-26 | 南京邮电大学 | Dynamic expression recognition method and system based on attention mechanism between space and time streams |
CN113180594A (en) * | 2021-03-09 | 2021-07-30 | 山西三友和智慧信息技术股份有限公司 | Method for evaluating postoperative pain of newborn through multidimensional space-time deep learning |
CN113313056A (en) * | 2021-06-16 | 2021-08-27 | 中国科学技术大学 | Compact 3D convolution-based lip language identification method, system, device and storage medium |
CN116796818A (en) * | 2022-03-15 | 2023-09-22 | 生物岛实验室 | Model training method, device, equipment, storage medium and program product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108596069A (en) | Neonatal pain expression recognition method and system based on depth 3D residual error networks | |
US20220148191A1 (en) | Image segmentation method and apparatus and storage medium | |
CN108388890A (en) | A kind of neonatal pain degree assessment method and system based on human facial expression recognition | |
CN105740612B (en) | Disease treatment system based on tcm clinical practice case | |
CN109820525A (en) | A kind of driving fatigue recognition methods based on CNN-LSTM deep learning model | |
CN111631688B (en) | Algorithm for automatic sleep staging | |
CN107799165A (en) | A kind of psychological assessment method based on virtual reality technology | |
CN107392109A (en) | A kind of neonatal pain expression recognition method based on deep neural network | |
CN109303560A (en) | A kind of atrial fibrillation recognition methods of electrocardiosignal in short-term based on convolution residual error network and transfer learning | |
CN108198620A (en) | A kind of skin disease intelligent auxiliary diagnosis system based on deep learning | |
CN106778014A (en) | A kind of risk Forecasting Methodology based on Recognition with Recurrent Neural Network | |
CN106682616A (en) | Newborn-painful-expression recognition method based on dual-channel-characteristic deep learning | |
CN107016438A (en) | A kind of system based on Chinese medical discrimination artificial neural network algorithm model | |
CN110322962A (en) | A kind of method automatically generating diagnostic result, system and computer equipment | |
CN108363979A (en) | Neonatal pain expression recognition method based on binary channels Three dimensional convolution neural network | |
CN112489769A (en) | Intelligent traditional Chinese medicine diagnosis and medicine recommendation system for chronic diseases based on deep neural network | |
CN106955112A (en) | Brain wave Emotion recognition method based on Quantum wavelet neural networks model | |
CN106909938A (en) | Viewing angle independence Activity recognition method based on deep learning network | |
CN106959946A (en) | A kind of text semantic feature generation optimization method based on deep learning | |
CN110175510A (en) | Multi-mode Mental imagery recognition methods based on brain function network characterization | |
CN109359610A (en) | Construct method and system, the data characteristics classification method of CNN-GB model | |
CN110659420A (en) | Personalized catering method based on deep neural network Monte Carlo search tree | |
CN106355574A (en) | Intra-abdominal adipose tissue segmentation method based on deep learning | |
CN112932501A (en) | Method for automatically identifying insomnia based on one-dimensional convolutional neural network | |
CN114145745B (en) | Graph-based multitasking self-supervision emotion recognition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180928 |