CN111488856B - Multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning - Google Patents
- Publication number
- CN111488856B (granted publication) · CN202010347655.1A (application)
- Authority
- CN
- China
- Prior art keywords
- map
- orthogonal
- feature
- expression recognition
- method based
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
A multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, relating to the technical field of computer vision. The method generates three attribute maps from face point cloud data, namely a depth map, a direction map and an elevation map; the three maps are combined into a three-channel RGB map, and the RGB map is used as the input of a single branch in the network, which reduces the number of model parameters. The method reduces the complexity of the deep learning network, suppresses the redundancy among features extracted by different branches of the network, and yields good economic and social benefits.
Description
Technical Field
The invention relates to the technical field of computer vision, in particular to a multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning.
Background
With the rapid development of deep learning, multimodal 2D and 3D facial expression recognition (FER) has received wide attention in the field of computer vision. Deep-learning-based methods typically extract several 3D attribute maps from 3D point cloud data, feed the attribute maps together with a 2D face image into separate feature-extraction branches of a CNN, and finally fuse the features extracted by each branch as the input to a classifier. However, since the 2D color map and the 3D attribute maps come from the same sample, the features learned by the branches may be redundant, which is unfavorable for direct feature fusion; moreover, dedicating one branch to each attribute map greatly increases the complexity of the model.
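The baseline pipeline described above fuses the per-branch feature vectors by simple concatenation before the classifier. A minimal sketch of that fusion step (the feature values here are made up for illustration; this is the baseline whose redundancy the patent's orthogonal module addresses):

```python
# Background sketch only: each branch of the CNN yields one feature vector,
# and the vectors are concatenated into a single classifier input.
def fuse_by_concatenation(branch_features):
    """Concatenate per-branch feature vectors into one classifier input."""
    fused = []
    for f in branch_features:
        fused.extend(f)
    return fused

f_2d    = [0.2, 0.7]   # illustrative features from the 2D color-image branch
f_depth = [0.3, 0.6]   # illustrative features from one 3D attribute-map branch
print(fuse_by_concatenation([f_2d, f_depth]))  # [0.2, 0.7, 0.3, 0.6]
```

If the two branches encode overlapping information about the same face, the concatenated vector carries that redundancy straight into the classifier.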
Disclosure of Invention
The invention aims to provide a multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning that addresses the defects and shortcomings of the prior art: it reduces the complexity of the deep learning network and suppresses the redundancy among features extracted by different branches of the network.
To achieve this purpose, the invention adopts the following technical scheme: three attribute maps, namely a depth map, a direction map and an elevation map, are generated from face point cloud data; the depth map, the direction map and the elevation map are combined into a three-channel RGB map, and the RGB map is used as the input of a single branch in the network, which reduces the number of model parameters.
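The composition step above can be sketched as follows: the depth, direction and elevation maps (names from the patent; the small values here are made up) are stacked as the R, G and B channels of one image, so a single branch consumes all three 3D attributes at once:

```python
# Stack three HxW single-channel attribute maps into an HxWx3 "RGB" map.
def compose_rgb(depth, direction, elevation):
    """Each output pixel is the (depth, direction, elevation) triple."""
    h, w = len(depth), len(depth[0])
    return [[(depth[i][j], direction[i][j], elevation[i][j])
             for j in range(w)] for i in range(h)]

depth     = [[0.1, 0.2], [0.3, 0.4]]
direction = [[0.5, 0.6], [0.7, 0.8]]
elevation = [[0.9, 1.0], [1.1, 1.2]]

rgb = compose_rgb(depth, direction, elevation)
print(rgb[0][0])  # (0.1, 0.5, 0.9): one pixel carrying all three attributes
```

Because the three attribute maps share one branch instead of three, the number of branch parameters drops accordingly.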
The multi-modal 2D and 3D facial expression recognition method based on orthogonal guide learning introduces an orthogonal module to ensure that the features are orthogonal during feature fusion.
In the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, the feature extraction part uses two network branches with different structures, denoted FE2DNet and FE3DNet, to extract features from the 2D face map and the 3D attribute map respectively; FE2DNet is a variant of the VGG network, and FE3DNet is derived from ResNet.
In the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, a global weighted pooling (GWP) layer is adopted in place of the GAP layer. Unlike generic object recognition, in the facial expression recognition task the images fed into the CNN are aligned by key points, so in a deep feature map each pixel corresponds to a specific region of the input image and carries fixed semantic information. Important regions such as the mouth, nose and eyes play a crucial role in correctly classifying expressions, and their semantic information deserves extra attention. Using GAP directly, which averages all pixels uniformly, is likely to ignore the semantic information of these key regions.
In the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, each feature map is equipped with a weight map of the same size, and the weights in the weight map are updated by gradient descent. Each output feature-vector element is the dot product of the feature map and its weight map: y_k = Σ_i w_{k,i} · x_{k,i}, where x_k, w_k and y_k denote the values of the k-th feature map, its weight map and the corresponding feature-vector element respectively. After training with a large amount of face data, the weight map focuses more on specific spatial regions, and a larger weight in the weight map indicates that the corresponding spatial region contributes more to the final classification result.
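A minimal sketch of GWP as described above, with made-up values: each feature map x_k has a same-size weight map w_k, and the k-th output element is the elementwise dot product. Setting every weight to 1/(H·W) recovers plain GAP, which shows how learned weights can instead focus on one region:

```python
# Global weighted pooling: lists of HxW maps -> feature vector.
def gwp(feature_maps, weight_maps):
    out = []
    for x, w in zip(feature_maps, weight_maps):
        out.append(sum(w[i][j] * x[i][j]
                       for i in range(len(x)) for j in range(len(x[0]))))
    return out

x = [[[1.0, 2.0], [3.0, 4.0]]]          # one 2x2 feature map (illustrative)
w_gap = [[[0.25, 0.25], [0.25, 0.25]]]  # uniform weights reproduce GAP
w_gwp = [[[0.0, 0.0], [0.0, 1.0]]]      # learned weights can attend one region

print(gwp(x, w_gap))  # [2.5] -- the plain average, as GAP would give
print(gwp(x, w_gwp))  # [4.0] -- attends only to the bottom-right pixel
```

In the patent's setting the weight maps are trained by gradient descent, so the network itself decides which facial regions (mouth, nose, eyes) to emphasize.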
In the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, the input images of the two branches are a 2D gray-level map and a 3D attribute map from the same face, so the feature vectors V1 and V2 extracted by the feature extractors may contain redundancy. Before feature fusion, V1 and V2 are passed through an orthogonal guide module so that the output feature vectors F1 and F2 are orthogonal, removing the redundant part between the two vectors. The orthogonal guide module consists of one fully connected layer and a ReLU layer.
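The forward pass of one orthogonal guide module (fully connected layer followed by ReLU) can be sketched as below. The weights and bias here are illustrative, not the patent's trained parameters; in practice they are learned under the orthogonality loss:

```python
# One fully connected layer + ReLU: F = max(0, W v + b).
def fc_relu(v, weights, bias):
    out = []
    for row, b in zip(weights, bias):
        s = sum(wi * vi for wi, vi in zip(row, v)) + b
        out.append(s if s > 0.0 else 0.0)  # ReLU clips negatives to zero
    return out

v1 = [1.0, -2.0, 0.5]          # illustrative branch feature vector V1
W  = [[0.5, 0.0, 1.0],
      [0.0, 1.0, 0.0]]         # illustrative FC weights
b  = [0.0, 0.1]                # illustrative FC bias

f1 = fc_relu(v1, W, b)
print(f1)  # [1.0, 0.0]: the second unit is clipped to zero by ReLU
```

The second module applies the same structure to V2, yielding F2; the loss then pushes F1 and F2 toward orthogonality.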
In the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, the orthogonal guide modules take V1 and V2 respectively as input, transform them through the fully connected layer, and output two orthogonal features F1 and F2. An orthogonal loss function L_orth is designed to supervise the updating of the orthogonal guide module weights to ensure orthogonality between F1 and F2. L_orth is defined in terms of the included angle θ between F1 and F2 (the original formula appeared as an image; one form consistent with the description is L_orth = cos²θ). The closer the loss L_orth is to 0, the closer the angle θ is to 90 degrees, and the more orthogonal, i.e. uncorrelated, F1 and F2 are.
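Assuming the squared-cosine form noted above (the patent's formula image is not recoverable, so this is one form consistent with the stated behaviour), the loss can be computed as:

```python
import math

# L_orth = cos^2(theta), with theta the angle between F1 and F2:
# the loss is 0 exactly when the two feature vectors are orthogonal.
def orth_loss(f1, f2):
    dot = sum(a * b for a, b in zip(f1, f2))
    n1 = math.sqrt(sum(a * a for a in f1))
    n2 = math.sqrt(sum(b * b for b in f2))
    cos_theta = dot / (n1 * n2)
    return cos_theta ** 2

print(orth_loss([1.0, 0.0], [0.0, 1.0]))  # 0.0 -- orthogonal, no penalty
print(orth_loss([1.0, 0.0], [1.0, 0.0]))  # 1.0 -- parallel, maximal penalty
```

Driving this loss to zero during training pushes θ toward 90 degrees, which is exactly the decorrelation the module is meant to enforce.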
The working principle of the invention is as follows: a multi-mode 2D and 3D face expression recognition method based on orthogonal guide learning utilizes face point cloud data to generate three attribute maps which are a depth map, a direction map and an elevation map respectively, the depth map, the direction map and the elevation map are combined into a three-channel RGB map, an orthogonal module is introduced to ensure that features are orthogonal during feature fusion, and before feature fusion, V is firstly led to 1 And V 2 Passing through an orthogonal guide module to output a feature vector F 1 And F 2 Orthogonal, removing the redundant part between the two vectors.
After the technical scheme is adopted, the invention has the beneficial effects that: by the multi-modal 2D and 3D facial expression recognition method based on orthogonal guide learning, the complexity of a deep learning network is reduced, the redundancy among features extracted from different branches in the network is inhibited, and good economic and social benefits are generated.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of the network architecture of the present invention and its flow chart;
FIG. 2 is a schematic diagram of the network structure of the FE2DNet and FE3DNet of the present invention;
FIG. 3 is a schematic flow diagram of the GWP operation structure of the present invention;
FIG. 4 is a structural flow chart of the orthogonal guide module of the present invention.
Detailed Description
Referring to FIGS. 1 to 4, the technical solution adopted by this embodiment is as follows: three attribute maps, namely a depth map, a direction map and an elevation map, are generated from face point cloud data; the depth map, the direction map and the elevation map are combined into a three-channel RGB map, and the RGB map is used as the input of a single branch in the network, which reduces the number of model parameters.
Furthermore, the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning introduces an orthogonal module to ensure that the features are orthogonal during feature fusion.
Further, in the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, the feature extraction part uses two network branches with different structures, denoted FE2DNet and FE3DNet, to extract features from the 2D face map and the 3D attribute map respectively; FE2DNet is a variant of the VGG network and FE3DNet is derived from ResNet.
Further, in the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, a global weighted pooling (GWP) layer is used in place of the GAP layer. Unlike generic object recognition, in the facial expression recognition task the images input to the CNN are aligned by key points, so in a deep feature map each pixel corresponds to a specific region of the input image and carries fixed semantic information. Important regions such as the mouth, nose and eyes play a crucial role in correctly classifying expressions, and their semantic information deserves extra attention. Using GAP directly, which averages all pixels uniformly, is likely to ignore the semantic information of these key regions.
Further, each feature map is equipped with a weight map of the same size, and the weights in the weight map are updated by gradient descent. Each output feature-vector element is the dot product of the feature map and its weight map: y_k = Σ_i w_{k,i} · x_{k,i}, where x_k, w_k and y_k denote the values of the k-th feature map, its weight map and the corresponding feature-vector element respectively. After training with a large amount of face data, the weight map focuses more on specific spatial regions, and a larger weight in the weight map indicates that the corresponding spatial region contributes more to the final classification result.
Further, in the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, the input images of the two branches are a 2D gray-level map and a 3D attribute map of the same face, so the feature vectors V1 and V2 extracted by the feature extractors may contain redundancy. Before feature fusion, V1 and V2 are passed through an orthogonal guide module so that the output feature vectors F1 and F2 are orthogonal, removing the redundant part between the two vectors. The orthogonal guide module consists of one fully connected layer and a ReLU layer.
Further, in the multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, the orthogonal guide modules take V1 and V2 respectively as input, transform them through the fully connected layer, and output two orthogonal features F1 and F2. An orthogonal loss function L_orth is designed to supervise the updating of the orthogonal guide module weights to ensure orthogonality between F1 and F2. L_orth is defined in terms of the included angle θ between F1 and F2 (the original formula appeared as an image; one form consistent with the description is L_orth = cos²θ). The closer the loss L_orth is to 0, the closer the angle θ is to 90 degrees, and the more orthogonal, i.e. uncorrelated, F1 and F2 are.
The working principle of the invention is as follows: a multi-mode 2D and 3D face expression recognition method based on orthogonal guide learning utilizes face point cloud data to generate three attribute maps which are a depth map, a direction map and an elevation map respectively, the depth map, the direction map and the elevation map are combined into a three-channel RGB map, an orthogonal module is introduced to ensure that features are orthogonal during feature fusion, and before feature fusion, V is firstly led to 1 And V 2 Passing through an orthogonal guide module to output a feature vector F 1 And F 2 Orthogonal, removing the redundant part between the two vectors.
After the technical scheme is adopted, the invention has the beneficial effects that: the multi-modal 2D and 3D facial expression recognition method based on orthogonal guide learning reduces the complexity of a deep learning network and inhibits the redundancy among the features extracted from different branches in the network, thereby generating good economic benefit and social benefit.
The above description is only for the purpose of illustrating the technical solutions of the present invention and not for the purpose of limiting the same, and other modifications or equivalent substitutions made by those skilled in the art to the technical solutions of the present invention should be covered within the scope of the claims of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims (3)
1. A multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning, characterized in that: the method generates three attribute maps, namely a depth map, a direction map and an elevation map, from face point cloud data; synthesizes the depth map, the direction map and the elevation map into a three-channel RGB map; and uses the RGB map as the input of one branch in a network; the feature extraction part uses two network branches to extract features from the 2D face map and the 3D attribute map respectively; in the facial expression recognition task, images input into the CNN network are aligned through key points, and each pixel in a deep feature map represents a specific area of the input image; the input images of the two branches are a 2D gray-level map and a 3D attribute map from the same face, and the feature vectors V1 and V2 extracted by the feature extractors contain redundancy; before feature fusion, V1 and V2 are passed through an orthogonal guide module so that the output feature vectors F1 and F2 are orthogonal, eliminating the redundant part between the two vectors; the orthogonal guide module consists of one fully connected layer and a ReLU layer; the orthogonal guide modules take V1 and V2 respectively as input, transform them through the fully connected layer and output two orthogonal features F1 and F2; and an orthogonal loss function L_orth is designed to supervise the updating of the orthogonal guide module weights to ensure orthogonality between F1 and F2.
2. The multi-modal 2D and 3D facial expression recognition method based on orthogonal guide learning of claim 1, wherein: an orthogonality module is introduced to ensure that the features are orthogonal when the features are fused.
3. The multi-modal 2D and 3D facial expression recognition method based on orthogonal guide learning of claim 1, wherein: each feature map is provided with a weight map with the same size, the weight in the weight map is updated by gradient descent, and the output feature vector is calculated by the dot product of the feature map and the weight map.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010347655.1A CN111488856B (en) | 2020-04-28 | 2020-04-28 | Multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111488856A CN111488856A (en) | 2020-08-04 |
CN111488856B true CN111488856B (en) | 2023-04-18 |
Family
ID=71796623
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010347655.1A Active CN111488856B (en) | 2020-04-28 | 2020-04-28 | Multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111488856B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112052834B (en) * | 2020-09-29 | 2022-04-08 | 支付宝(杭州)信息技术有限公司 | Face recognition method, device and equipment based on privacy protection |
CN113408462B (en) * | 2021-06-29 | 2023-05-02 | 西南交通大学 | Landslide remote sensing information extraction method based on convolutional neural network and class thermodynamic diagram |
CN113642467B (en) * | 2021-08-16 | 2023-12-01 | 江苏师范大学 | Facial expression recognition method based on improved VGG network model |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102043943A (en) * | 2009-10-23 | 2011-05-04 | 华为技术有限公司 | Method and device for obtaining human face pose parameter |
WO2016110005A1 (en) * | 2015-01-07 | 2016-07-14 | 深圳市唯特视科技有限公司 | Gray level and depth information based multi-layer fusion multi-modal face recognition device and method |
CN106778468A (en) * | 2016-11-14 | 2017-05-31 | 深圳奥比中光科技有限公司 | 3D face identification methods and equipment |
CN107392190A (en) * | 2017-09-07 | 2017-11-24 | 南京信息工程大学 | Color face recognition method based on semi-supervised multi views dictionary learning |
JP2018055470A (en) * | 2016-09-29 | 2018-04-05 | 国立大学法人神戸大学 | Facial expression recognition method, facial expression recognition apparatus, computer program, and advertisement management system |
CN108510573A (en) * | 2018-04-03 | 2018-09-07 | 南京大学 | A method of the multiple views human face three-dimensional model based on deep learning is rebuild |
CN108573284A (en) * | 2018-04-18 | 2018-09-25 | 陕西师范大学 | Deep Learning Face Image Augmentation Method Based on Orthogonal Experimental Analysis |
CN109299702A (en) * | 2018-10-15 | 2019-02-01 | 常州大学 | A method and system for human behavior recognition based on deep space-time map |
CN109344909A (en) * | 2018-10-30 | 2019-02-15 | 咪付(广西)网络技术有限公司 | A kind of personal identification method based on multichannel convolutive neural network |
WO2019196308A1 (en) * | 2018-04-09 | 2019-10-17 | 平安科技(深圳)有限公司 | Device and method for generating face recognition model, and computer-readable storage medium |
CN110516616A (en) * | 2019-08-29 | 2019-11-29 | 河南中原大数据研究院有限公司 | A kind of double authentication face method for anti-counterfeit based on extensive RGB and near-infrared data set |
CN110717916A (en) * | 2019-09-29 | 2020-01-21 | 华中科技大学 | Pulmonary embolism detection system based on convolutional neural network |
CN114638283A (en) * | 2022-02-11 | 2022-06-17 | 华南理工大学 | Orthogonal convolution neural network image identification method based on tensor optimization space |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11216541B2 (en) * | 2018-09-07 | 2022-01-04 | Qualcomm Incorporated | User adaptation for biometric authentication |
CN109815785A (en) * | 2018-12-05 | 2019-05-28 | 四川大学 | A facial emotion recognition method based on two-stream convolutional neural network |
- 2020-04-28: CN application CN202010347655.1A filed — patent CN111488856B — status: Active
Non-Patent Citations (6)
Title |
---|
Orthogonalization guided feature fusion network for multimodal 2D+3D facial expression recognition; Lin, SS et al.; IEEE Transactions on Multimedia; vol. 23; pp. 1581-1591 *
Towards Reading Beyond Faces for Sparsity-aware 3D/4D Affect Recognition; Muzammil Behzad et al.; Neurocomputing; vol. 458; pp. 297-307 *
A facial expression recognition method based on two-step dimensionality reduction and parallel feature fusion; Yang Yong et al.; Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition); vol. 27, no. 3; pp. 377-385 *
Research on key technologies of facial expression recognition; Li Hongfei; China Masters' Theses Full-text Database, Information Science and Technology; no. 8; pp. I138-1019 *
Research on facial expression recognition based on convolutional neural networks; Li Siquan et al.; Software Guide; vol. 17, no. 1; pp. 28-31 *
Image classification based on multi-view kernel discriminant correlation and orthogonal analysis; Zhu Zhenyu; China Masters' Theses Full-text Database, Information Science and Technology; no. 2; pp. I138-3605 *
Also Published As
Publication number | Publication date |
---|---|
CN111488856A (en) | 2020-08-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112784764B (en) | A method and system for facial expression recognition based on local and global attention mechanism | |
CN112950477B (en) | A High Resolution Salient Object Detection Method Based on Dual Path Processing | |
CN110021051B (en) | A Generative Adversarial Network-based Human Image Generation Method Guided by Text | |
CN111488856B (en) | Multimodal 2D and 3D facial expression recognition method based on orthogonal guide learning | |
CN112800937B (en) | Intelligent face recognition method | |
CN116630608A (en) | Multi-mode target detection method for complex scene | |
Sun et al. | IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes | |
CN113269089A (en) | Real-time gesture recognition method and system based on deep learning | |
CN114049381A (en) | A Siamese Cross-Target Tracking Method Fusing Multi-layer Semantic Information | |
CN112036260B (en) | Expression recognition method and system for multi-scale sub-block aggregation in natural environment | |
CN113591928B (en) | Vehicle re-identification method and system based on multi-view and convolution attention module | |
CN113111751B (en) | Three-dimensional target detection method capable of adaptively fusing visible light and point cloud data | |
CN111985525B (en) | Text recognition method based on multi-mode information fusion processing | |
US12080098B2 (en) | Method and device for training multi-task recognition model and computer-readable storage medium | |
CN110569724A (en) | A Face Alignment Method Based on Residual Hourglass Network | |
Kang et al. | Real-time eye tracking for bare and sunglasses-wearing faces for augmented reality 3D head-up displays | |
CN113888603A (en) | Loop closure detection and visual SLAM method based on optical flow tracking and feature matching | |
CN116311518A (en) | A Hierarchical Human Interaction Detection Method Based on Human Interaction Intent Information | |
Bai et al. | DHRNet: A Dual-Branch Hybrid Reinforcement Network for Semantic Segmentation of Remote Sensing Images | |
CN117809339A (en) | Human body posture estimation method based on deformable convolutional coding network and feature region attention | |
CN114937153B (en) | Neural Network-Based Visual Feature Processing System and Method in Weak Texture Environment | |
CN117315137A (en) | Monocular RGB image gesture reconstruction method and system based on self-supervision learning | |
CN113269068B (en) | A Gesture Recognition Method Based on Multimodal Feature Conditioning and Embedding Representation Enhancement | |
CN117095326A (en) | Weather variability text pedestrian re-identification algorithm based on high-frequency information guidance | |
Mizukami et al. | CUDA implementation of deformable pattern recognition and its application to MNIST handwritten digit database |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |