CN110826462A - Human behavior recognition method of a non-local dual-stream convolutional neural network model - Google Patents

Human behavior recognition method of a non-local dual-stream convolutional neural network model

Info

Publication number
CN110826462A
CN110826462A (application number CN201911053686.XA)
Authority
CN
China
Prior art keywords
layer
local
neural network
behavior recognition
double
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911053686.XA
Other languages
Chinese (zh)
Inventor
周云
陈淑荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University
Priority to CN201911053686.XA
Publication of CN110826462A
Legal status: Pending

Classifications

    • G06V40/20 — Recognition of biometric, human-related or animal-related patterns in image or video data: movements or behaviour, e.g. gesture recognition
    • G06F18/241 — Pattern recognition: classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045 — Neural network architectures: combinations of networks
    • G06N3/047 — Probabilistic or stochastic networks
    • G06N3/08 — Neural networks: learning methods

Abstract

The invention relates to a human behavior recognition method based on a non-local dual-stream convolutional neural network model. Starting from the dual-stream convolutional neural network model, both branch networks are improved: a non-local feature extraction module is added to the spatial-stream CNN and the temporal-stream CNN to extract more comprehensive and cleaner feature maps. This deepens the network to a certain extent, effectively mitigates overfitting, extracts non-local features of the samples, and denoises the input feature maps, addressing the low recognition accuracy caused by complex backgrounds, diverse human behaviors, and highly similar actions in behavior videos. The invention also trains the loss layer with an A-softmax loss function, which adds an m-fold constraint on the classification angle to the standard softmax function and constrains the fully connected layer's weight W and bias b (‖W‖ = 1, b = 0), so that inter-class distances of the samples become larger and intra-class distances smaller. This yields better recognition accuracy and, finally, a deep learning model with stronger discriminative power.

Description

Human behavior recognition method of a non-local dual-stream convolutional neural network model
Technical Field
The invention relates to computer-vision image and video processing technology, and in particular to a human behavior recognition method based on a non-local dual-stream convolutional neural network model.
Background
Research on human behavior recognition aims to give computers vision-like abilities, so that they can acquire information through a visual system much as humans do: analyzing human actions in video, and classifying and understanding human behavior by automatically tracking its global and local information. Because behavior videos feature complex backgrounds, diverse human behaviors, and highly similar actions, two similar actions may be assigned to the same class, lowering the accuracy of human behavior recognition. Human behavior recognition therefore remains a challenging task in computer vision. Current academic research on video behavior recognition falls into two main directions: traditional behavior recognition methods based on machine learning, and behavior recognition methods based on deep learning. Traditional methods rely on hand-crafted features, which introduce large errors; deep learning methods, represented by the convolutional neural network (CNN), improve recognition accuracy and have become a popular research direction in recent years. By learning features automatically, deep networks show strong feature-extraction ability and can learn highly adaptive, discriminative features for a given task. Deep learning has achieved good results in human action recognition.
Disclosure of Invention
To solve the problem of low recognition accuracy caused by complex backgrounds, diverse human behaviors, and highly similar actions in behavior videos, the invention provides a human behavior recognition method based on a non-local dual-stream convolutional neural network model. It designs a non-local dual-stream convolutional neural network that combines a traditional CNN with non-local feature extraction modules, and adopts the A-softmax loss function so that, in the final classification of the two branch models, inter-class distances become larger and intra-class distances smaller.
The invention provides a human behavior recognition method based on a non-local dual-stream convolutional neural network model, which trains data samples with a network structure that combines a CNN model with non-local neural network modules: in every processing stage except the last, features are extracted by a convolutional layer and reduced in dimension by a pooling layer, then passed through the non-local neural network module for non-local feature extraction and feature denoising before entering the next convolutional and pooling layers; after the final convolutional and pooling layers, the fully connected layer aggregates the sample features into a feature vector, which the loss layer then classifies and normalizes.
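The stage ordering described above can be sketched in Python (the three-stage depth and the stage names are illustrative assumptions; the patent does not fix the number of convolution/pooling stages):

```python
def build_stream(num_stages=3):
    """Order the layers of one stream as described: every stage except the
    last is Conv -> Pool -> NL block; the final stage is Conv -> Pool,
    followed by the fully connected layer and the loss layer."""
    stages = []
    for i in range(1, num_stages + 1):
        stages += [f"conv{i}", f"pool{i}"]
        if i < num_stages:  # no NL block after the final pooling layer
            stages.append(f"nl_block{i}")
    stages += ["fc", "a_softmax_loss"]
    return stages

print(build_stream())
# ['conv1', 'pool1', 'nl_block1', 'conv2', 'pool2', 'nl_block2',
#  'conv3', 'pool3', 'fc', 'a_softmax_loss']
```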
Preferably, an A-softmax loss function is adopted in the loss layer, an m-fold constraint is imposed on the classification angle, and the constraints ‖W‖ = 1 and ‖b‖ = 0 are imposed on the weight W and bias b of the fully connected layer.
Preferably, a dual-stream convolutional neural network model is adopted, extracting the spatial appearance information and temporal motion information of the video samples through a spatial-stream CNN model and a temporal-stream CNN model;
the input video sample set is preprocessed into RGB frames and optical-flow images, which are split into a training set and a test set and fed to the spatial-stream and temporal-stream CNN models for training and testing, respectively;
and the loss-layer outputs of the spatial-stream and temporal-stream CNN models are fused by weighting to obtain the behavior recognition result of the dual-stream convolutional neural network model.
The invention has the beneficial effects that:
① The invention designs non-local dual-stream convolutional neural network training. Compared with the original dual-stream convolutional network, the NL-CNN extracts more comprehensive and cleaner feature maps and therefore trains better. Adding a non-local module (NL block) to an ordinary CNN deepens the network to a certain extent and effectively mitigates overfitting; more importantly, the non-local module extracts the non-local features of the sample and denoises the input feature maps, which better addresses the low recognition accuracy caused by complex backgrounds, diverse human behaviors, highly similar actions, and similar factors in behavior videos.
② When classifying in the softmax layer, the invention uses the A-softmax loss function, which not only adds an m-fold constraint on the classification angle to the original softmax function but also imposes the two constraints ‖W‖ = 1 and b = 0 on the weight W and bias b of the fully connected layer preceding the A-softmax layer. This enlarges the distance between different classes and reduces the distance within the same class, so that inter-class distances of the samples become larger and intra-class distances smaller, yielding better classification and recognition results.
Drawings
FIG. 1 is a diagram of an overall non-local dual-stream convolutional neural network architecture.
Fig. 2 is an NL block configuration diagram.
Fig. 3a and Fig. 3b are structure diagrams of the NL-CNN combining a CNN with an NL block, where Fig. 3a shows an ordinary CNN connected to an NL block and Fig. 3b shows a residual network connected to an NL block.
Fig. 4 is a schematic diagram of softmax classification.
Fig. 5a and 5b are geometric illustrations of A-softmax, corresponding to the 2D and 3D hypersphere manifold, respectively.
Fig. 6a and 6b show a classification process and the result of A-softmax for two classes.
Detailed Description
The invention provides a human behavior recognition method based on a non-local dual-stream convolutional neural network model. The overall network adopts the dual-stream architecture shown schematically in Fig. 1: the framework consists mainly of a spatial-stream and a temporal-stream CNN model, used to extract the spatial appearance information and temporal motion information of the video samples. The two streams have identical convolutional-layer and fully-connected-layer structures and settings, and share weight parameters.
In Fig. 1, the input video sample set is preprocessed into RGB frames and optical-flow images (a single RGB frame and consecutive optical-flow frames), split into a training set and a test set, and fed to the spatial-stream CNN and temporal-stream CNN for training and testing, respectively. A suitable input size is chosen for the RGB frames and optical-flow images, and a suitable network model is chosen as the backbone of the dual-stream CNN for further feature extraction and training.
In general, a CNN consists mainly of convolutional layers (Conv), pooling layers (Pool), fully connected layers (FC), and a loss layer. In such a model, the convolutional layers extract features, the pooling layers reduce dimensionality to obtain salient feature regions, and the fully connected layer integrates the local information extracted by the convolutional layers into global information, which introduces a large number of parameters. To address this and let the deep network fuse non-local information better, a non-local neural network module (NL block) is inserted before a convolutional layer of the CNN to extract non-local features of the sample and denoise the features; classification and normalization are finally performed in the softmax layer. The softmax outputs of the spatial-stream and temporal-stream CNNs are fused by weighting to obtain the final behavior recognition result of the dual-stream CNN model.
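The final weighted fusion of the two streams' softmax outputs can be sketched with NumPy as follows (the equal 0.5/0.5 stream weights and the three-class logits are illustrative assumptions, not values fixed by the patent):

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def fuse_two_stream(spatial_logits, temporal_logits,
                    w_spatial=0.5, w_temporal=0.5):
    """Weighted fusion of the softmax outputs of the spatial-stream and
    temporal-stream CNNs; returns the predicted class and fused probabilities."""
    fused = w_spatial * softmax(spatial_logits) + w_temporal * softmax(temporal_logits)
    return fused.argmax(axis=-1), fused

# toy example with 3 behavior classes
pred, probs = fuse_two_stream(np.array([2.0, 0.5, 0.1]),
                              np.array([1.5, 0.2, 0.3]))
```

As long as the stream weights sum to 1, the fused scores remain a probability distribution over the classes.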
The invention also trains the final loss layer of the network with an A-softmax loss function, thereby obtaining better recognition accuracy.
Following the use of non-local means in image denoising, the non-local feature extraction module (NL block) relates local features to feature points across the whole image by the non-local-mean method, establishing a connection between two pixels that are some distance apart in the image.
Fig. 2 is the NL block structure diagram; in the figure, h and d denote the height and width (the size) of the feature map. The NL block is abstractly described as

y_i = (1/C(s)) Σ_{∀j} f(s_i, s_j) g(s_j)

where s denotes the input signal (the feature map) and y the output signal, which has the same size as s, and C(s) is a normalization factor. The pairwise term, here written in the embedded-Gaussian form consistent with the weights W_θ and W_φ below,

f(s_i, s_j) = e^(θ(s_i)ᵀ φ(s_j))

computes the correlation between pixel s_i and all pixels s_j, and g(s_j) = W_g s_j computes the feature value of the input signal at position j. The final output of the NL block is z_i = W_z y_i + s_i. Here W_θ, W_φ, W_g, and W_z are learnable weight matrices, implemented in practice as 1×1 convolutions.
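A minimal NumPy sketch of this computation on a feature map flattened to N positions (the embedded-Gaussian form of f and plain matrix multiplies standing in for the 1×1 convolutions are modeling assumptions; all shapes are illustrative):

```python
import numpy as np

def nl_block(s, W_theta, W_phi, W_g, W_z):
    """Non-local block: y_i = (1/C(s)) * sum_j f(s_i, s_j) g(s_j),
    followed by the residual output z_i = W_z y_i + s_i.
    s has shape (N, C): N spatial positions, C channels."""
    theta, phi, g = s @ W_theta, s @ W_phi, s @ W_g   # the 1x1-conv embeddings
    f = np.exp(theta @ phi.T)                         # pairwise correlations f(s_i, s_j)
    f = f / f.sum(axis=1, keepdims=True)              # normalize by C(s)
    y = f @ g                                         # aggregate features from all positions
    return y @ W_z + s                                # residual connection keeps input size

rng = np.random.default_rng(0)
N, C, Ce = 6, 4, 2                                    # positions, channels, embedding dim
s = rng.standard_normal((N, C))
z = nl_block(s,
             rng.standard_normal((C, Ce)), rng.standard_normal((C, Ce)),
             rng.standard_normal((C, Ce)), rng.standard_normal((Ce, C)))
# z has the same shape as s, so the block can sit between Pool and the next Conv
```

Because the output shape equals the input shape, the block can be dropped between existing layers without changing the rest of the network.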
In the invention, a traditional CNN model is connected to NL blocks as follows: after a feature map extracted by a CNN convolutional layer is reduced in dimension by the pooling layer, the data sample is fed into the NL block, which outputs the sum of the original feature map and the non-local feature map computed by the operations above; the result is then fed into the next convolutional layer. The combinations of CNN and NL block are shown in Figs. 3a and 3b.
Principle of the loss function:
Fig. 4 is a schematic of softmax classification. The network takes an input layer (Input), passes it through two feature layers (Features I and II), and a softmax classifier finally outputs the probability of each case; in the three-class example shown, these are the probability values of y = 0, y = 1, and y = 2.
Principle of the A-softmax loss:
The A-softmax loss function is expressed as

L_ang = (1/N) Σ_i −log( e^(‖x_i‖·ψ(θ_{y_i,i})) / ( e^(‖x_i‖·ψ(θ_{y_i,i})) + Σ_{j≠y_i} e^(‖x_i‖·cos θ_{j,i}) ) )

where ψ(θ_{y_i,i}) = (−1)^k cos(m·θ_{y_i,i}) − 2k for θ_{y_i,i} ∈ [kπ/m, (k+1)π/m], k ∈ [0, m−1].
θ_{j,i} denotes the angle between x_i and the weight W_j of each other class; θ_{y_i,i} denotes the angle between x_i and the weight W_{y_i} of class y_i, with j ∈ [1, K] and K the total number of classes.
The A-softmax loss function not only adds an m-fold constraint on the classification angle to the original softmax function, but also imposes two constraints, ‖W‖ = 1 and b = 0, on the weight W and bias b of the fully connected layer preceding the A-softmax layer.
Here m ≥ 2 and m ∈ N. The larger m is, the more discriminative the learned features are, but the harder they are to learn, so the most suitable value can be chosen by a limited number of experiments; preferably m = 4, which gives the best results.
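To make the margin concrete, here is a NumPy sketch of ψ and the per-sample A-softmax loss (the feature norm ‖x‖ = 2 and the angles are illustrative values; the class weights are assumed already normalized to ‖W‖ = 1 with b = 0, as required above):

```python
import numpy as np

def psi(theta, m):
    """psi(theta) = (-1)^k * cos(m*theta) - 2k
    for theta in [k*pi/m, (k+1)*pi/m], k = 0..m-1."""
    k = np.minimum(np.floor(theta * m / np.pi), m - 1).astype(int)
    return (-1.0) ** k * np.cos(m * theta) - 2 * k

def a_softmax_loss(x_norm, thetas, target, m):
    """A-softmax loss for one sample.
    x_norm: ||x_i||; thetas: angles between x_i and every class weight W_j."""
    logits = x_norm * np.cos(thetas)
    logits[target] = x_norm * psi(np.asarray(thetas[target]), m)  # margin on true class
    z = logits - logits.max()                                     # numerical stability
    return float(-z[target] + np.log(np.exp(z).sum()))

angles = np.array([0.3, 1.0, 1.2])  # sample closest to class 0
loss_m1 = a_softmax_loss(2.0, angles, target=0, m=1)  # m = 1: plain softmax loss
loss_m4 = a_softmax_loss(2.0, angles, target=0, m=4)
# the m-fold margin shrinks the true-class score, so the loss grows with m
```

With m = 1, ψ reduces to cos(θ), i.e. the ordinary softmax logit, so the two calls differ only by the angular margin.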
With these constraints, the classification process of A-softmax depends only on the angle between W and x; see the geometric illustrations of A-softmax in Figs. 5a and 5b, where x denotes a sample (the general form of x_i in the formula).
Taking binary classification as an example: in both the 2-dimensional plane and 3-dimensional space, the A-softmax loss produces two separated decision planes, and the size of the separation is positively correlated with m; a decision plane is a plane in Euclidean space that separates the different classes. Suppose x is a sample of class 1 and W_1, W_2 are the weights of the two classes, with θ_1 and θ_2 the angles between x and W_1, W_2. The A-softmax classification criterion is cos(m·θ_1) > cos(θ_2), equivalent to m·θ_1 < θ_2. On the unit sphere, θ_1 and θ_2 correspond to the lengths ω_1 and ω_2 of their arcs; comparing θ_1 and θ_2, x belongs to the class with the smaller angle.
Figs. 6a and 6b show a classification process and result of A-softmax for two classes: Fig. 6a shows A-softmax using weight normalization to place the two classes on unit circles, and Fig. 6b shows the classes divided by an angular interval, which makes the final behavior classification of the invention more accurate and yields higher recognition accuracy. The margin in Fig. 6b is the interval between class 1 and class 2; the larger m is, the larger the inter-class interval during training, and the harder the learning.
The CNN model can be trained with networks such as VGGNet-16, ResNet-50, and GoogLeNet, and tested on several public databases (e.g., the KTH, UCF101, and HMDB51 datasets). The proposed method is implemented with the deep learning framework PyTorch. To verify its effectiveness, a VGGNet-16 network was chosen as an example for experiments on the UCF101 dataset, with the following hardware: GPU, Nvidia GeForce GTX 1080 Ti (12 GB video memory); memory, 128 GB; CPU, Intel Core eight-core i7 processor (3.60 GHz base clock). The experiments further show that, under the same conditions, the method improves the recognition accuracy of human behavior recognition by 1.1%.
In summary, the invention improves both branch networks of the original two-stream convolutional neural network (Two-stream CNN) model, designing a non-local CNN model that adds a non-local block (NL block) before the convolutional layers of the spatial-stream and temporal-stream CNNs. This remedies the drawback that CNN convolutional layers extract only local features while global features can be fused only in the final fully connected layer, at the cost of an excessive number of parameters, and it also denoises features inside the network, allowing the deep network to fuse non-local information better.
After the convolutional layers extract the sample features, the network aggregates them in the last fully connected layer to obtain a feature vector, which is sent to the softmax layer for classification. The conventional softmax loss function is easy to optimize, but it does not maximize the distance between different classes or minimize the distance within the same class. The invention instead adopts the A-softmax loss function, which adds an m-fold constraint on the classification angle to the original softmax function and imposes the constraints ‖W‖ = 1 and b = 0 on the weight W and bias b of the fully connected layer preceding the A-softmax layer, so that inter-class distances of the samples become larger and intra-class distances smaller. A deep learning model with stronger discriminative power is finally obtained.
While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.

Claims (6)

1. A human behavior recognition method of a non-local dual-stream convolutional neural network model, characterized in that
data samples are trained with a network structure combining a CNN model and non-local neural network modules: in every processing stage except the last, features are extracted by a convolutional layer and reduced in dimension by a pooling layer, then passed through the non-local neural network module for non-local feature extraction and feature denoising before entering the next convolutional and pooling layers; after the final convolutional and pooling layers, the fully connected layer aggregates the sample features into a feature vector, which the loss layer then classifies and normalizes.
2. The human behavior recognition method according to claim 1,
wherein an A-softmax loss function is adopted in the loss layer, an m-fold constraint is imposed on the classification angle, and the constraints ‖W‖ = 1 and ‖b‖ = 0 are imposed on the weight W and bias b of the fully connected layer.
3. The human behavior recognition method according to claim 2,
the m multiple of the restricted classification angle is m > 2, and m belongs to N.
4. The human behavior recognition method according to claim 1 or 2,
wherein a dual-stream convolutional neural network model is adopted, extracting the spatial appearance information and temporal motion information of the video samples through a spatial-stream CNN model and a temporal-stream CNN model;
the input video sample set is preprocessed into RGB frames and optical-flow images, which are split into a training set and a test set and fed to the spatial-stream and temporal-stream CNN models for training and testing, respectively;
and the loss-layer outputs of the spatial-stream and temporal-stream CNN models are fused by weighting to obtain the behavior recognition result of the dual-stream convolutional neural network model.
5. The human behavior recognition method according to claim 1,
wherein the non-local feature extraction module, following the definition of the non-local mean, is abstractly described as

y_i = (1/C(s)) Σ_{∀j} f(s_i, s_j) g(s_j)

where s denotes the input signal (the feature map) and y the output signal, of the same size as s;
C(s) is a normalization factor; f(s_i, s_j) computes the correlation between pixel s_i and pixel s_j;
g(s_j) = W_g s_j computes the feature value of the input signal at position j;
the output of the non-local feature extraction module is z_i = W_z y_i + s_i;
and W_θ, W_φ, W_g, W_z are learnable weight matrices, implemented by 1×1 convolutions.
6. The human behavior recognition method according to claim 2,
the A-softmax loss function is expressed as

L_ang = (1/N) Σ_i −log( e^(‖x_i‖·ψ(θ_{y_i,i})) / ( e^(‖x_i‖·ψ(θ_{y_i,i})) + Σ_{j≠y_i} e^(‖x_i‖·cos θ_{j,i}) ) )

where ψ(θ_{y_i,i}) = (−1)^k cos(m·θ_{y_i,i}) − 2k for θ_{y_i,i} ∈ [kπ/m, (k+1)π/m], k ∈ [0, m−1];
θ_{j,i} denotes the angle between x_i and the weight W_j of each other class;
θ_{y_i,i} denotes the angle between x_i and the weight W_{y_i} of class y_i, with j ∈ [1, K] and K the total number of classes.
CN201911053686.XA 2019-10-31 2019-10-31 Human body behavior identification method of non-local double-current convolutional neural network model Pending CN110826462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911053686.XA CN110826462A (en) 2019-10-31 2019-10-31 Human body behavior identification method of non-local double-current convolutional neural network model


Publications (1)

Publication Number Publication Date
CN110826462A true CN110826462A (en) 2020-02-21

Family

ID=69551797

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911053686.XA Pending CN110826462A (en) 2019-10-31 2019-10-31 Human body behavior identification method of non-local double-current convolutional neural network model

Country Status (1)

Country Link
CN (1) CN110826462A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711277A (en) * 2018-12-07 2019-05-03 中国科学院自动化研究所 Behavioural characteristic extracting method, system, device based on space-time frequency domain blended learning
CN110334589A (en) * 2019-05-23 2019-10-15 中国地质大学(武汉) A kind of action identification method of the high timing 3D neural network based on empty convolution


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Christoph Feichtenhofer et al.: "Convolutional Two-Stream Network Fusion for Video Action Recognition", arXiv:1604.06573v2 [cs.CV] *
Weiyang Liu et al.: "SphereFace: Deep Hypersphere Embedding for Face Recognition", 2017 IEEE Conference on Computer Vision and Pattern Recognition *
Xiaolong Wang et al.: "Non-local Neural Networks", Computer Vision Foundation *
Lin Jiayue: "Research and Implementation of Behavior Recognition and Localization Based on Deep Learning", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220058396A1 (en) * 2019-11-19 2022-02-24 Tencent Technology (Shenzhen) Company Limited Video Classification Model Construction Method and Apparatus, Video Classification Method and Apparatus, Device, and Medium
US11967152B2 (en) * 2019-11-19 2024-04-23 Tencent Technology (Shenzhen) Company Limited Video classification model construction method and apparatus, video classification method and apparatus, device, and medium
CN111666813A (en) * 2020-04-29 2020-09-15 浙江工业大学 Subcutaneous sweat gland extraction method based on three-dimensional convolutional neural network of non-local information
CN111666813B (en) * 2020-04-29 2023-06-30 浙江工业大学 Subcutaneous sweat gland extraction method of three-dimensional convolutional neural network based on non-local information
CN111626197A (en) * 2020-05-27 2020-09-04 陕西理工大学 Human behavior recognition network model and recognition method
CN111626197B (en) * 2020-05-27 2023-03-10 陕西理工大学 Recognition method based on human behavior recognition network model
CN112487229A (en) * 2020-11-27 2021-03-12 北京邮电大学 Fine-grained image classification method and system and prediction model training method
CN112733953A (en) * 2021-01-19 2021-04-30 福州大学 Lung CT image arteriovenous vessel separation method based on Non-local CNN-GCN and topological subgraph
CN112949460A (en) * 2021-02-26 2021-06-11 陕西理工大学 Human body behavior network model based on video and identification method
CN112949460B (en) * 2021-02-26 2024-02-13 陕西理工大学 Human behavior network model based on video and identification method
CN113674753A (en) * 2021-08-11 2021-11-19 河南理工大学 New speech enhancement method

Similar Documents

Publication Publication Date Title
CN110826462A (en) Human body behavior identification method of non-local double-current convolutional neural network model
CN110135366B (en) Shielded pedestrian re-identification method based on multi-scale generation countermeasure network
Jiao et al. New generation deep learning for video object detection: A survey
Du et al. Skeleton based action recognition with convolutional neural network
CN110188239B (en) Double-current video classification method and device based on cross-mode attention mechanism
CN111612008B (en) Image segmentation method based on convolution network
CN110826389B (en) Gait recognition method based on attention 3D frequency convolution neural network
CN110082821B (en) Label-frame-free microseism signal detection method and device
CN113762138B (en) Identification method, device, computer equipment and storage medium for fake face pictures
Guo et al. JointPruning: Pruning networks along multiple dimensions for efficient point cloud processing
CN111145145B (en) Image surface defect detection method based on MobileNet
CN112905828B (en) Image retriever, database and retrieval method combining significant features
Tang et al. Integrated feature pyramid network with feature aggregation for traffic sign detection
Liu et al. Attentive cross-modal fusion network for RGB-D saliency detection
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
CN106529441B (en) Depth motion figure Human bodys' response method based on smeared out boundary fragment
Wu et al. Pose-aware multi-feature fusion network for driver distraction recognition
CN114492634A (en) Fine-grained equipment image classification and identification method and system
CN114283326A (en) Underwater target re-identification method combining local perception and high-order feature reconstruction
CN113239866B (en) Face recognition method and system based on space-time feature fusion and sample attention enhancement
CN114782979A (en) Training method and device for pedestrian re-recognition model, storage medium and terminal
CN113850182A (en) Action identification method based on DAMR-3 DNet
CN117333908A (en) Cross-modal pedestrian re-recognition method based on attitude feature alignment
CN113887509B (en) Rapid multi-modal video face recognition method based on image set
Hassan et al. Enhanced dynamic sign language recognition using slowfast networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200221)