CN108960076B - Ear recognition and tracking method based on convolutional neural network - Google Patents

Ear recognition and tracking method based on convolutional neural network

Info

Publication number
CN108960076B
CN108960076B CN201810586771.1A
Authority
CN
China
Prior art keywords
ear
network
neural network
data set
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810586771.1A
Other languages
Chinese (zh)
Other versions
CN108960076A (en)
Inventor
林云智
王雁刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN201810586771.1A priority Critical patent/CN108960076B/en
Publication of CN108960076A publication Critical patent/CN108960076A/en
Application granted granted Critical
Publication of CN108960076B publication Critical patent/CN108960076B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships

Abstract

The invention discloses an ear recognition and tracking method based on a convolutional neural network, comprising the following steps: building a first-layer convolutional neural network on an existing face data set with face-box labels, and detecting the person's head in an image to obtain a face image containing the ear region; building a second-layer convolutional neural network on an ear data set with ear-box labels, and training it to detect the ear region in the image output by step 1; and building a third-layer neural network on an ear data set with ear feature-point labels, and training it to automatically label the ear feature points in the image output by step 2. The invention adopts a three-layer cascade structure, which effectively solves the detection and feature-point labeling problems given that existing ear data sets are relatively small. The multi-layer design also markedly compresses the network size: the parameter count is relatively small, the demand on GPU memory during training is modest, training converges more easily, and performance is better under complex conditions.

Description

Ear recognition and tracking method based on convolutional neural network
Technical Field
The invention belongs to the technical fields of computer vision and image processing, relates to object detection and feature-point localization technology, and in particular to an ear recognition and tracking method based on a convolutional neural network.
Background
Ear recognition and modeling have a very important impact on the realistic rendering of virtual objects. In the field of biometrics, automatic identification from ear images is an active research area. The ability to capture an ear image covertly from a distance makes this technology an attractive option for surveillance and security applications, among others. Compared with traditional biometric schemes such as fingerprint, face and iris recognition, the ear has unique advantages: it has a stable, rich structure that changes little with age and is unaffected by facial expressions. Recent studies have even verified empirically that certain ear characteristics differ even between monozygotic (identical) twins.
Ear images can therefore supplement other biometric modalities in an automatic identification system and provide an identity cue when other information is unreliable or unavailable. For example, in surveillance applications, the ear can serve as a source of identity information when the camera captures only a side view of the face and face recognition is therefore unreliable.
The basis of ear recognition is ear detection and ear feature selection. In recent years, significant effort in this field has produced a number of algorithms based on locally encoded features, and a number of data sets for training and testing the technology are publicly available, but some unsolved research issues still hinder its wider commercial application.
First, robust ear detection is the cornerstone of the overall recognition system. Many techniques exist for automatically extracting ears from 2D face images. Most of them, however, do not perform well when the test images are taken under uncontrolled conditions. Moreover, occlusion and illumination variation are common in practical applications, which poses a challenging problem in urgent need of a solution.
Meanwhile, existing ear recognition technology mainly focuses on extracting and analyzing geometric features of the ear image, and is limited by its dependence on edge detectors, which are sensitive to illumination change and noise. Traditional work on ear feature-point localization positions only a few feature points as recognition aids and does not meet the requirement of multi-point localization, while many techniques also require precise ear feature-point locations. Developing an automatic feature-labeling method has therefore become urgent.
A convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond to stimuli within a local receptive field, and it performs well on large-scale image processing. Incorporating CNNs into the ear recognition and tracking problem raises several key technical issues:
The first issue is the collection and curation of data samples. Over the last decade, deep learning algorithms have greatly advanced computer vision: performance on visual tasks such as image classification, face recognition and object detection has improved significantly. Such systems are data-driven, highly robust, respond well to many kinds of challenge, and do not depend on hand-crafted features. The invention therefore adopts a deep learning method to recognize and track ears. Traditional AdaBoost or SVM methods have low requirements on sample volume, which existing data sets can satisfy, but deep learning places high demands on both the quantity and the accuracy of training samples. At the same time, the ear must remain detectable and labelable under multi-angle conditions, which further raises the requirements on data acquisition.
The selection of ear feature points also strongly influences the final recognition result. The number of selected points is often excessive, reaching 48 or even 55; although this theoretically improves the subsequent recognition rate, it places unreasonably high demands on the labeling workload and on the real-time performance of data processing. It is therefore crucial to choose an appropriate number of points, and their specific positions, that serve the recognition goal while reducing the manual labeling workload and relaxing the real-time processing requirement.
One difficulty of CNN-based ear recognition and tracking is how to extract the ear with high precision while overcoming adverse effects such as illumination, occlusion and noise. In images with complex backgrounds the ear region is often highly variable, with large changes in scale and angle; naively applying a CNN to ear extraction easily causes misidentification when the sample size is insufficient and sample selection is limited. It is precisely these disturbances that have kept ear detection algorithms from reaching a practically usable level. A second difficulty is how to design a reliable network structure that can label ear feature points at low resolution: the ear is very small relative to the face, and its features are densely concentrated in a small area, making labeling extremely difficult.
Disclosure of Invention
To solve these problems, the invention discloses an ear recognition and tracking method based on a convolutional neural network (CNN). Through data augmentation, a three-layer cascaded deep convolutional network, a pyramid-style sliding-window network and related methods, it accurately detects ears and localizes their feature points in video, provides higher accuracy for human-ear detection and feature-point labeling, and is more robust in complex environments.
Based on the technical key points, the invention provides the following technical scheme:
the ear recognition and tracking method based on the convolutional neural network comprises the following steps:
step 1, building a first layer of convolutional neural network aiming at an existing face data set and a face frame label, and detecting the head of a person in an image to obtain a face image containing an ear region;
step 2, building a second laminated neural network for the ear data set and the ear labeling frame label, and detecting the ear region in the output image in the step 1 through training;
and 3, building a third layer neural network aiming at the ear data set and the ear characteristic point label, and automatically labeling the ear characteristic points in the output image in the step 2 through training.
Further, step 2 comprises a training part and a detection part.
The training part first acquires data and expands the data set, trains the network with the expanded data set, and obtains the network weights in combination with a bounding-box regression step.
The detection part generates a deployment network, reads the network weights obtained by the training part, and then detects the ear region in the image output by step 1.
Further, the process of expanding the data set includes: obtaining multiple positive samples from each original picture by four methods (translation, rotation, cropping and scaling); obtaining multiple negative samples by randomly cropping pictures of different sizes from a certain area around the ear; and doubling the data by horizontal flipping.
A fixed length-to-width ratio is used when samples are obtained during data-set expansion.
Further, the bounding-box regression step specifically comprises: performing linear regression on the difference between the coordinates predicted by the network and the ground-truth annotation, so as to fine-tune the obtained box coordinates and make them more accurate.
Further, the detection part specifically comprises the following sub-steps:
picture preprocessing: scaling the original picture;
pyramid model: downsampling the preprocessed picture through the pyramid model to obtain 9 pictures of different sizes, so that the network can detect pictures of different sizes;
heat-map generation: passing the 9 pictures generated by the pyramid model through the sliding-window network in turn to obtain the corresponding heat maps, and mapping the heat maps back onto the original image by coordinate-scale conversion;
non-maximum suppression: putting all the detection boxes obtained from the 9 pictures together, searching for local maxima of the detection-box scores within each local area using a non-maximum suppression algorithm, and deleting detection boxes whose scores fall below a threshold.
Further, step 3 comprises the following sub-steps:
expanding the data set: obtaining multiple samples from each original picture by three methods (horizontal flipping, contrast modification, and rotation by small positive and negative angles), and producing an HDF5 multi-label file after cropping and scaling;
detection with the third-layer network architecture: adopting a network structure in which convolutional and pooling layers alternate, with a fully connected layer producing the final output.
Further, the third-layer network architecture adopts the ReLU activation function and uses a dropout layer to randomly discard weights with a certain probability.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention adopts a three-layer cascade structure, and can effectively solve the problems of detection and feature point marking under the condition that the existing ear data set is relatively small through a mature face detection network of a first layer, an ear detection network of a second layer and an ear feature point marking network of a third layer. And the multi-layer network can obviously compress the size of the network, the parameter quantity of the network structure is relatively small, the requirement on the video memory in the training stage is not high, and the training is easier to converge. Because a data-driven deep learning network is adopted without depending on the traditional contour detection or local feature coding technology, the method has better performance under the complex conditions of multiple angles, multiple scales, shielding and the like.
Drawings
Fig. 1 is a flowchart of an ear recognition and tracking method based on a convolutional neural network according to the present invention.
Fig. 2 is a schematic diagram of a calibration frame for face detection modification in the present invention.
Fig. 3 is a schematic diagram of a deep learning network structure for ear detection according to the present invention.
FIG. 4 is an expanded view of the sample in the ear test of the present invention.
Fig. 5 is a schematic diagram of a deep learning network structure for ear feature point labeling according to the present invention.
Fig. 6 is a diagram of ear feature point labeling effect.
Detailed Description
The technical solutions provided by the invention are described in detail below with reference to specific examples. It should be understood that the following embodiments are only illustrative and do not limit the scope of the invention.
The invention provides a three-layer network structure; the complete processing pipeline is shown in Fig. 1. It is divided into three stages. The first stage performs face detection and corrects the face box so that it includes the ear region. The second stage performs ear detection on the local head region from the first stage to obtain a more accurate ear position. The third stage labels the ear feature points within the ear region. The three stages comprise several supervised learning processes. Specifically, the ear recognition and tracking method based on a convolutional neural network comprises the following steps:
firstly, adopting a first-stage convolution neural network to detect the head of a person
We use a sophisticated face detection network to obtain the face and expand the final face calibration box coordinates by modification to include the ear region, as shown in fig. 2.
The invention firstly uses the improved human face detection deep learning network to obtain the head area containing the ear area. This significantly reduces the workload of the next local detection.
Second, build a second convolutional neural network from the existing ear image data set and the ear bounding-box labels to extract the ear region from the output image of the first step.
Dual-task architecture: the second-stage network adopts a multitask design, mainly comprising an ear classifier and ear candidate-box coordinate regression (as shown in Fig. 3).
This step includes a training part and a detection part.
The training part covers data acquisition and expansion. Because existing ear data sets are small, data augmentation is needed to enlarge the sample set. First, positive and negative samples for the ear classifier are obtained: we use four methods (translation, rotation, cropping and scaling) to obtain 30 positive samples from each original picture (as shown in Fig. 4), and obtain 60 negative samples by randomly cropping pictures of different sizes from a certain area around the ear. Meanwhile, so that the training samples suit the sliding-window network, a fixed aspect ratio is used when obtaining samples (determined from the data-set statistics as the median ear-box aspect ratio, 0.512). For the ear candidate-box coordinate data set, we additionally expand the data by horizontal flipping, yielding twice as many samples.
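The augmentation described above can be sketched as follows, assuming row-major nested-list images and (x, y, w, h) boxes; the helper names `jitter_box`, `hflip` and `scale_box`, and the shift fraction, are illustrative choices, not taken from the patent:

```python
import random

def jitter_box(box, img_w, img_h, max_shift=0.1):
    """Translate an (x, y, w, h) box by a random fraction of its size,
    clamped to the image bounds - one of the four positive-sample
    transforms (translation, rotation, cropping, scaling)."""
    x, y, w, h = box
    dx = int(w * random.uniform(-max_shift, max_shift))
    dy = int(h * random.uniform(-max_shift, max_shift))
    x = min(max(x + dx, 0), img_w - w)
    y = min(max(y + dy, 0), img_h - h)
    return (x, y, w, h)

def hflip(image):
    """Horizontal flip of a row-major nested-list image (doubles the set)."""
    return [list(reversed(row)) for row in image]

def scale_box(box, factor, img_w, img_h):
    """Scale a box about its centre while preserving the fixed aspect
    ratio required by the sliding-window network."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    w2, h2 = int(w * factor), int(h * factor)  # ratio preserved
    x2 = int(min(max(cx - w2 / 2, 0), img_w - w2))
    y2 = int(min(max(cy - h2 / 2, 0), img_h - h2))
    return (x2, y2, w2, h2)
```

In practice each original crop would be passed through several such transforms to produce the 30 positive samples per picture mentioned above.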
Bounding-box regression: linear regression is performed on the difference between the coordinates predicted by the network and the ground-truth annotation, fine-tuning the obtained box coordinates to make them more accurate.
The training part yields the weights of the network.
The detection part generates a deployment network and reads the network weights obtained by the training part. It specifically comprises the following steps:
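The box-regression idea can be sketched as below. The patent states only that the predicted-versus-ground-truth difference is fitted by linear regression, so the centre/log-scale parameterisation used here (the common R-CNN-style choice) is an assumption. Boxes are (cx, cy, w, h):

```python
import math

def regression_targets(pred, gt):
    """Offsets a linear regressor is trained to predict so that a
    coarse detected box can be nudged onto the ground-truth box."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    return ((gx - px) / pw,        # box-relative centre shift (x)
            (gy - py) / ph,        # box-relative centre shift (y)
            math.log(gw / pw),     # width correction
            math.log(gh / ph))     # height correction

def apply_offsets(pred, t):
    """Inverse transform: refine a predicted box with regressed offsets."""
    px, py, pw, ph = pred
    tx, ty, tw, th = t
    return (px + tx * pw, py + ty * ph, pw * math.exp(tw), ph * math.exp(th))
```

At inference time the network outputs the four offsets directly and `apply_offsets` produces the fine-tuned box.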
preprocessing the picture: the picture mean is subtracted from each pixel of the original picture and scaled to [0,1 ].
The pyramid model is as follows: the pyramid model is used for sampling the image to be detected (namely, the preprocessed image) downwards to obtain 9 groups of images with different sizes, so that the network can be suitable for detecting the images with different sizes.
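The pyramid step can be sketched as repeated downsampling. The per-level scale factor below (about 2^(-1/3)) and the nearest-neighbour resampling are assumptions; the patent fixes only the count of 9 levels:

```python
def pyramid_scales(n=9, factor=0.7937):
    """Geometric scale sequence; 0.7937 ~ 2**(-1/3) is an assumed step."""
    return [factor ** i for i in range(n)]

def downsample(image, scale):
    """Nearest-neighbour downsampling of a nested-list image - a
    stand-in for the pyramid resize, so that a fixed-size
    sliding-window net sees ears at many effective sizes."""
    h, w = len(image), len(image[0])
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    return [[image[int(r / scale)][int(c / scale)] for c in range(nw)]
            for r in range(nh)]

def build_pyramid(image, n=9, factor=0.7937):
    """The n differently sized copies fed to the sliding-window net."""
    return [downsample(image, s) for s in pyramid_scales(n, factor)]
```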
Generating a thermodynamic diagram: and (3) sequentially passing 9 pictures generated by the pyramid model through a sliding window network to obtain a corresponding thermodynamic diagram, and mapping the thermodynamic diagram to an original image through coordinate proportion transformation.
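Mapping a heat map back onto the original image amounts to inverting the window stride and the pyramid scale. The window size, stride and threshold below are illustrative, as the patent does not specify them:

```python
def heatmap_to_boxes(heatmap, scale, window=24, stride=2, thr=0.6):
    """Convert sliding-window scores into original-image boxes.
    Each heat-map cell (r, c) corresponds to a window whose top-left
    corner in the *scaled* image is (c*stride, r*stride); dividing by
    the pyramid scale recovers original-image coordinates."""
    boxes, scores = [], []
    for r, row in enumerate(heatmap):
        for c, s in enumerate(row):
            if s >= thr:
                x1, y1 = c * stride / scale, r * stride / scale
                boxes.append((x1, y1,
                              x1 + window / scale, y1 + window / scale))
                scores.append(s)
    return boxes, scores
```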
Non-maxima suppression algorithm: all the test frames obtained from the 9 pictures (obtained in the step of generating the thermodynamic diagram) are put together. A non-maximum suppression algorithm (NMS) is adopted to search local maximum values of the scores of detection frames in a local area, a certain threshold value is set, and the detection frames below the threshold value score are deleted.
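A minimal sketch of the NMS step on (x1, y1, x2, y2) boxes; the IoU and score thresholds are illustrative values:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, score_thr=0.5, iou_thr=0.3):
    """Drop boxes under the score threshold, then greedily keep local
    maxima, suppressing any remaining box that overlaps a kept one."""
    order = sorted((i for i, s in enumerate(scores) if s >= score_thr),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thr for j in keep):
            keep.append(i)
    return keep
```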
In this step, ear classification information and the key-point positions of the ear bounding box are integrated in a unified network architecture to obtain the ear region. By fusing these two kinds of information within one stage of the deep convolutional network, supervised learning over richer ear information is achieved. We found that fusing the coordinate information improves ear-region detection accuracy. Compared with traditional methods that depend on the ear contour structure, the deep convolutional network adapts to different ear shapes and is more robust.
A third-layer deep convolutional network is then built to complete feature-point labeling. To address the sparsity of data samples and the concentration of feature points in a small area, the data are expanded with traditional image-processing methods such as cropping, rotation and deformation, and a shallower network structure is designed so that training converges while overfitting on the small sample set is prevented. Specifically:
and thirdly, building a third layer neural network aiming at the ear data set and the ear characteristic point label, and realizing automatic marking of the ear characteristic point in the output image in the step 2 through training.
Data acquisition and expansion: firstly, because the existing ear labeling data set is small, the data expansion method is adopted to increase the data set samples. We adopt 3 methods of horizontal turning, contrast modification and rotation of-5 ° - +5 ° to obtain 8 samples from each original picture, and for the convenience of network input, we adopt 1: 1 aspect ratio (as shown). Finally we scaled it to 96 × 96 size to make hdf5 multi-label file.
Network architecture: a network structure with a convolutional layer and a pooling layer alternated is adopted, and finally, a result is output by a full connection layer. To better converge the network, we use the Relu activation function in the middle of the network and use the dropout layer to randomly drop weights with 50% probability. The overall architecture is shown in fig. 5. The final labeling of the ear feature points through the above three steps is shown in fig. 6.
The technical means disclosed in the invention are not limited to those of the above embodiments, and also include technical solutions formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications are also considered within the scope of the invention.

Claims (4)

1. An ear recognition and tracking method based on a convolutional neural network, characterized by comprising the following steps:
step 1, building a first-layer convolutional neural network on an existing face data set with face-box labels, and detecting the person's head in an image to obtain a face image containing the ear region;
step 2, building a second-layer convolutional neural network on the ear data set with ear-box labels, and through training detecting the ear region in the image output by step 1;
the second-layer convolutional neural network adopts a multitask design, comprising an ear classifier and ear candidate-box coordinate regression;
it comprises a training part and a detection part;
the training part first acquires data and expands the data set, trains the network with the expanded data set, and obtains the network weights in combination with a bounding-box regression step;
the detection part generates a deployment network and, after reading the network weights obtained by the training part, detects the ear region in the image output by step 1; the detection part specifically comprises the following sub-steps:
picture preprocessing: scaling the original picture;
pyramid model: downsampling the preprocessed picture through the pyramid model to obtain 9 pictures of different sizes, so that the network can detect pictures of different sizes;
heat-map generation: passing the 9 pictures generated by the pyramid model through the sliding-window network in turn to obtain the corresponding heat maps, and mapping the heat maps back onto the original image by coordinate-scale conversion;
non-maximum suppression: putting all the detection boxes obtained from the 9 pictures together, searching for local maxima of the detection-box scores within each local area using a non-maximum suppression algorithm, and deleting detection boxes whose scores fall below a threshold;
the bounding-box regression step specifically comprises: performing linear regression on the difference between the coordinates predicted by the network and the ground-truth annotation, fine-tuning the obtained box coordinates to make them more accurate;
the second-layer convolutional neural network integrates ear classification information and the key-point positions of the ear bounding box to obtain the ear region;
step 3, building a third-layer neural network on the ear data set with ear feature-point labels, and through training automatically labeling the ear feature points in the image output by step 2;
step 3 comprises the following sub-steps:
expanding the data set: obtaining multiple samples from each original picture by three methods (horizontal flipping, contrast modification, and rotation by small positive and negative angles), and producing an HDF5 multi-label file after cropping and scaling;
detection with the third-layer network architecture: adopting a network structure in which convolutional and pooling layers alternate, with a fully connected layer producing the final output.
2. The convolutional-neural-network-based ear recognition and tracking method of claim 1, wherein the process of expanding the data set comprises: obtaining multiple positive samples from each original picture by four methods (translation, rotation, cropping and scaling); obtaining multiple negative samples by randomly cropping pictures of different sizes from a certain area around the ear; and doubling the data by horizontal flipping.
3. The convolutional-neural-network-based ear recognition and tracking method of claim 2, wherein a fixed length-to-width ratio is used when obtaining samples during expansion of the data set.
4. The convolutional-neural-network-based ear recognition and tracking method of claim 1, wherein the third-layer network architecture adopts a ReLU activation function and uses a dropout layer to randomly drop weights with a certain probability.
CN201810586771.1A 2018-06-08 2018-06-08 Ear recognition and tracking method based on convolutional neural network Active CN108960076B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810586771.1A CN108960076B (en) 2018-06-08 2018-06-08 Ear recognition and tracking method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810586771.1A CN108960076B (en) 2018-06-08 2018-06-08 Ear recognition and tracking method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN108960076A CN108960076A (en) 2018-12-07
CN108960076B true CN108960076B (en) 2022-07-12

Family

ID=64493464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810586771.1A Active CN108960076B (en) 2018-06-08 2018-06-08 Ear recognition and tracking method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN108960076B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858435B (en) * 2019-01-29 2020-12-01 四川大学 Small panda individual identification method based on face image
CN110110702A (en) * 2019-05-20 2019-08-09 哈尔滨理工大学 It is a kind of that algorithm is evaded based on the unmanned plane for improving ssd target detection network
CN111062248A (en) * 2019-11-08 2020-04-24 宇龙计算机通信科技(深圳)有限公司 Image detection method, device, electronic equipment and medium
CN111260608A (en) * 2020-01-08 2020-06-09 来康科技有限责任公司 Tongue region detection method and system based on deep learning
CN111401211B (en) * 2020-03-11 2023-01-06 山东大学 Iris identification method adopting image augmentation and small sample learning
CN112580462A (en) * 2020-12-11 2021-03-30 深圳市豪恩声学股份有限公司 Feature point selection method, terminal and storage medium
CN113887428B (en) * 2021-09-30 2022-04-19 西安工业大学 Deep learning paired model human ear detection method based on context information

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101551853A (en) * 2008-11-14 2009-10-07 重庆大学 Human ear detection method under complex static color background
CN101673340A (en) * 2009-08-13 2010-03-17 重庆大学 Method for identifying human ear by colligating multi-direction and multi-dimension and BP neural network
CN107316007A (en) * 2017-06-07 2017-11-03 浙江捷尚视觉科技股份有限公司 A kind of monitoring image multiclass object detection and recognition methods based on deep learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8345984B2 (en) * 2010-01-28 2013-01-01 Nec Laboratories America, Inc. 3D convolutional neural networks for automatic human action recognition
CN107748858A (en) * 2017-06-15 2018-03-02 华南理工大学 A kind of multi-pose eye locating method based on concatenated convolutional neutral net

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101551853A (en) * 2008-11-14 2009-10-07 重庆大学 Human ear detection method under complex static color background
CN101673340A (en) * 2009-08-13 2010-03-17 重庆大学 Method for identifying human ear by colligating multi-direction and multi-dimension and BP neural network
CN107316007A (en) * 2017-06-07 2017-11-03 浙江捷尚视觉科技股份有限公司 A kind of monitoring image multiclass object detection and recognition methods based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on Human Ear Recognition Based on Convolutional Neural Networks"; Hu Ying; Journal of North University of China (Natural Science Edition); Dec. 31, 2015; Vol. 36, No. 5; pp. 597-601 *

Also Published As

Publication number Publication date
CN108960076A (en) 2018-12-07

Similar Documents

Publication Publication Date Title
CN108960076B (en) Ear recognition and tracking method based on convolutional neural network
Rahmad et al. Comparison of Viola-Jones Haar Cascade classifier and histogram of oriented gradients (HOG) for face detection
CN109344693B (en) Deep learning-based face multi-region fusion expression recognition method
CN108334848B (en) Tiny face recognition method based on generation countermeasure network
CN111401257B (en) Face recognition method based on cosine loss under non-constraint condition
Zhan et al. Face detection using representation learning
CN108520216B (en) Gait image-based identity recognition method
CN107392182B (en) Face acquisition and recognition method and device based on deep learning
Wu et al. A detection system for human abnormal behavior
CN109800643B (en) Identity recognition method for living human face in multiple angles
CN107808376B (en) Hand raising detection method based on deep learning
CN109063626B (en) Dynamic face recognition method and device
KR102132407B1 (en) Method and apparatus for estimating human emotion based on adaptive image recognition using incremental deep learning
CN112784763A (en) Expression recognition method and system based on local and overall feature adaptive fusion
CN115880784A (en) Scenic spot multi-person action behavior monitoring method based on artificial intelligence
CN111639577A (en) Method for detecting human faces of multiple persons and recognizing expressions of multiple persons through monitoring video
Waheed et al. A novel deep learning model for understanding two-person interactions using depth sensors
CN108898623A (en) Method for tracking target and equipment
CN113378649A (en) Identity, position and action recognition method, system, electronic equipment and storage medium
Ardiansyah et al. Systematic literature review: American sign language translator
CN108985216B (en) Pedestrian head detection method based on multivariate logistic regression feature fusion
Jindal et al. Sign Language Detection using Convolutional Neural Network (CNN)
CN110766093A (en) Video target re-identification method based on multi-frame feature fusion
Curran et al. The use of neural networks in real-time face detection
Bai et al. Exploration of computer vision and image processing technology based on OpenCV

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant