CN108229442B - Method for rapidly and stably detecting human face in image sequence based on MS-KCF - Google Patents


Info

Publication number
CN108229442B
CN108229442B (application CN201810124952.2A)
Authority
CN
China
Prior art keywords
detection
network
kcf
convolution
image sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810124952.2A
Other languages
Chinese (zh)
Other versions
CN108229442A (en)
Inventor
李小霞
李旻择
叶远征
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology filed Critical Southwest University of Science and Technology
Priority to CN201810124952.2A priority Critical patent/CN108229442B/en
Publication of CN108229442A publication Critical patent/CN108229442A/en
Application granted granted Critical
Publication of CN108229442B publication Critical patent/CN108229442B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/207 - Analysis of motion for motion estimation over a hierarchy of resolutions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30196 - Human being; Person
    • G06T2207/30201 - Face


Abstract

The invention provides a method for rapidly and stably detecting human faces in an image sequence based on MS-KCF. Aiming at the problem of detecting faces with large angle changes and severe occlusion in an image sequence, the invention proposes a new automatic Detection-Tracking-Detection (DTD) mode that integrates a fast and accurate target detection model, MobileNet-SSD (MS), with a fast tracking model, the Kernelized Correlation Filter (KCF), namely the MS-KCF face detection model. The method comprises the following steps: step 1, building the MS detection network; step 2, detecting the target with the MS network; step 3, updating the tracking model to predict the position of the face target in the next frame; step 4, after tracking for a number of frames, updating the MS detection network and re-detecting and localizing the face target; and step 5, comparing and analyzing the experimental results. Experiments show that the MS-KCF model not only keeps face detection stable under large angle changes and severe occlusion in an image sequence, but also greatly improves the detection speed.

Description

Method for rapidly and stably detecting human face in image sequence based on MS-KCF
Technical Field
The invention belongs to the technical field of target detection of machine vision, and particularly relates to a method for rapidly and stably detecting a human face in an image sequence based on MS-KCF.
Background
With the continuous development of computer technology and the steady improvement of computing power, face detection, an important branch of computer vision, has made great breakthroughs and is now widely applied in access control systems, intelligent surveillance, smart cameras and the like. Face detection nevertheless remains challenging: detecting, stably and in real time, faces that undergo large angle changes and severe occlusion in an image sequence has become a problem that urgently needs to be solved in applications. Traditional methods based on shallow features no longer meet these requirements, so deep Convolutional Neural Networks (CNN) are the focus and hot spot of current detection research.
Traditional face detection methods are numerous, but they share the following characteristics: first, features must be selected manually, the process is complex, and the quality of the detection result depends on the prior knowledge of the researchers; second, the target is detected by sliding a window over the image, which produces many redundant windows and high time complexity, and the detection effect on faces with large angle changes and severe occlusion in an image sequence is poor.
In recent years, CNNs have made great breakthroughs in the field of target detection and are now the most advanced target detection methods. The landmark breakthrough of CNN in target detection was the R-CNN (Region-based CNN) network proposed by Ross Girshick et al. in 2014, whose mean Average Precision (mAP) on VOC is twice that of the HOG-based DPM (Deformable Parts Model) target detection algorithm proposed by Felzenszwalb et al. Since the advent of R-CNN, CNN-based target detection has dominated the VOC benchmarks and can be broadly divided into two categories: (1) target detection based on candidate regions, characterized by high detection precision but a speed that cannot meet real-time requirements, represented by R-FCN in 2016, Faster R-CNN in 2017, Mask R-CNN in 2017, and so on; (2) target detection based on regression, characterized by high speed but lower detection accuracy, represented by YOLO (You Only Look Once) in 2015, SSD (Single Shot MultiBox Detector) in 2016, and so on. Jonathan Huang et al. elaborated in 2016 on how to trade off detection accuracy and speed among meta-structures (SSD, Faster R-CNN and R-FCN). In addition, some cascaded face detection methods also perform well: the Joint Cascade method proposed by Chen et al. in 2014 cascades face detection with facial landmark detection and achieves a high detection rate among traditional face detection methods; the MTCNN proposed by Zhang et al. in 2016 cascades three convolutional networks, and its coarse-to-fine structure gives multitask face detection a high recall rate, but training requires three different datasets, which is cumbersome; the Faceness network proposed by Yang et al. in 2016 judges whether a detected target is a face using five attributes, namely nose, mouth, eyes, hair and beard, and has high detection precision but does not meet the real-time criterion. Deep learning is also moving toward embedded devices such as mobile phones, where the parameter count of the base network is highly constrained by real-time requirements; for this purpose MobileNet was proposed by Andrew G. Howard et al. in 2017, trading a small loss in classification accuracy for a large reduction in parameters. MobileNet has about 1/33 the parameters of VGG16, and its ImageNet-1000 classification accuracy is 70.6%, only 0.9% lower than that of VGG16. In summary, achieving both speed and accuracy remains a difficult point in the field of target detection.
Disclosure of Invention
In practical engineering applications, faces are mostly detected in image sequences, and the system is required to detect, stably and in real time, faces with large angle changes and severe occlusion. Therefore, the fast MobileNet base network is improved and combined with the SSD face detection framework to form the MS (MobileNet-SSD) detection network, which balances detection speed and precision well; the MS network parameters are adjusted for the two-class face detection task (face target and background); and the Kernelized Correlation Filter (KCF) algorithm is then used to track the detected face stably, forming a Detection-Tracking-Detection (DTD) cyclic update mode, namely the MS-KCF face detection model. The model not only solves the stability problem of detecting faces with large angle changes and severe occlusion, but also greatly improves the detection speed for face targets in an image sequence.
The technical scheme of the present invention is as follows: a method for rapidly and stably detecting human faces in an image sequence based on MS-KCF, mainly comprising the following steps:
step 1, building the MS (MobileNet-SSD) detection network;
step 2, reading an image sequence, and detecting the image by using an MS detection network;
step 3, updating the tracking model: passing the coordinate information of the detected face target to the KCF tracker as its base sample box, and sampling and training near this box to predict the position of the face target in the next frame;
step 4, in order to prevent the loss of the human face target during tracking, after tracking for a plurality of frames, updating the MS detection network, and re-detecting and positioning the human face target;
and 5, comparing and analyzing the experimental result with the current advanced face detection method.
Drawings
FIG. 1 is a general flow chart of the system of the present invention
FIG. 2 is a diagram of a MS network architecture in accordance with the present invention
FIG. 3 is a diagram of the improved MobileNet convolution structure of the present invention
FIG. 4 is a pyramid of the convolution characteristics of the MS network of the present invention
FIG. 5 is a graph showing the test results of the MS-KCF model of the present invention
FIG. 6 is a ROC curve comparison graph of Girl image sequences of the present invention
FIG. 7 is a ROC curve comparison of the faceOcc1 image sequence of the present invention.
Detailed Description
The MS-KCF-based method for rapidly and stably detecting human faces in an image sequence is described in further detail below with reference to the examples and the accompanying drawings.
As shown in FIG. 1, the system of the present invention comprises an image sequence acquisition module, an MS detection network module, a KCF tracking module, and a model update module. Together, these modules form a new automatic Detection-Tracking-Detection (DTD) cyclic update mode in the whole network, namely the MS-KCF face detection model.
Step 1, building the MS (MobileNet-SSD) detection network. As shown in fig. 2, the MS detection network structure includes four parts: the first part is the input layer, used to input pictures; the second part is the improved MobileNet convolution network, used to extract features from the input picture; the third part is the SSD meta-structure, used for classification regression and bounding-box regression; the fourth part is the output layer, used to output the detection result. Table 1 shows the overall architecture of the MS detection network, where Conv_BN_ReLU6 denotes a standard convolutional layer, Conv1_Dw_Pw denotes a depthwise separable convolutional layer, and 'v' marks the feature maps of convolutional layers that are used for both classification regression and bounding-box regression. Since face targets are small, the feature map output by the shallow layer Conv7_Dw_Pw is taken.
TABLE 1 MS Overall architecture
The MS detection network comprises two parts: the improved MobileNet convolution network and the SSD meta-structure.
(1) Feature extraction with the improved MobileNet convolutional network. Fig. 3 shows the improved MobileNet convolution structure: Conv_Dw_Pw is a depthwise separable convolution (Depthwise Separable Convolutions), Dw is the 3x3 depthwise convolution layer (Depthwise Layers), Pw is the 1x1 pointwise convolution layer (Pointwise Layers), and every convolution operation is followed by a Batch Normalization (BN) algorithm and the ReLU6 activation function. The invention replaces the ReLU activation function of the MobileNet network with ReLU6, which, together with the BN algorithm that automatically adjusts the data distribution, speeds up the convergence of training. Equation (1) is the ReLU6 activation function:
y = min(max(x, 0), 6)        (1)

where x is the input to the activation function and y is the output.
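For illustration, the following is a minimal PyTorch sketch of the Conv_Dw_Pw building block described above (an assumed implustration-only implementation, not the patent's code): a 3x3 depthwise convolution followed by BN and ReLU6, then a 1x1 pointwise convolution followed by BN and ReLU6.

    import torch.nn as nn

    def conv_dw_pw(in_ch, out_ch, stride=1):
        """Depthwise separable block: Dw (3x3) -> BN -> ReLU6, then Pw (1x1) -> BN -> ReLU6."""
        return nn.Sequential(
            # Dw: 3x3 depthwise convolution, one filter per input channel (groups=in_ch)
            nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU6(inplace=True),
            # Pw: 1x1 pointwise convolution, recombines the channels
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU6(inplace=True),
        )

    block = conv_dw_pw(32, 64, stride=2)   # e.g. 32 -> 64 channels, downsampling by 2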
The improved MobileNet convolution structure is skillfully designed.
First, the depth separable convolution structure greatly reduces the amount of computation and speeds up the convergence rate during training for the following reasons:
when performing the calculation of the standard convolution, assume that the size of the input image is
Figure 647578DEST_PATH_IMAGE003
MRepresenting an input imageThe number of the channels of (a) is,Nrepresenting the number of channels of the convolution output, the standard convolution kernel size being
Figure 615534DEST_PATH_IMAGE004
. If the calculation cost is expressed by the parameter number, the calculation cost of the method is as follows:
Figure 118191DEST_PATH_IMAGE005
however, for the depth separable convolution formula in MobileNet, the required size of the depth convolution kernel at the first half Dw stage is the same as the above-mentioned input and output
Figure 829795DEST_PATH_IMAGE006
In the latter half Pw stage, the required size of the point convolution kernel is
Figure 226141DEST_PATH_IMAGE007
The computation cost of the depth separable convolution at this point is:
Figure 364999DEST_PATH_IMAGE008
at the cost of standard convolution calculations
Figure 948427DEST_PATH_IMAGE009
And (4) doubling.
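As a quick arithmetic check of the ratio above (the layer sizes below are illustrative values, not taken from Table 1):

    # Cost of a standard vs. a depthwise separable convolution, in multiply operations.
    # Illustrative sizes: 3x3 kernel, 64 input channels, 128 output channels, 38x38 feature map.
    Dk, M, N, Df = 3, 64, 128, 38
    standard = Dk * Dk * M * N * Df * Df                  # standard convolution
    separable = Dk * Dk * M * Df * Df + M * N * Df * Df   # depthwise + pointwise
    print(separable / standard)        # ~0.119
    print(1 / N + 1 / Dk ** 2)         # same ratio, ~0.119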
Second, during the training of a convolutional neural network, the distribution of the data changes after each convolutional layer. If the data fall on the saturated edges of the activation function, the gradient vanishes and the parameters are no longer updated. The BN algorithm adjusts the distribution of the data (towards an approximately standard normal distribution) through two learnable parameters, avoiding the vanishing-gradient phenomenon and the complicated setting of hyperparameters (learning rate, Dropout ratio, and so on) during training.
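A NumPy sketch of the batch-normalization forward pass described above, with the two learnable parameters gamma (scale) and beta (shift); this is an illustrative training-mode forward only, with moving averages and backpropagation omitted:

    import numpy as np

    def batch_norm(x, gamma, beta, eps=1e-5):
        # x: (batch, features); statistics are computed over the batch dimension
        mu = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mu) / np.sqrt(var + eps)   # approximately standard-normal per feature
        return gamma * x_hat + beta             # learnable rescaling and shifting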
(2) SSD meta-structure regression. The SSD network is a regression model that performs classification regression and bounding-box regression on the features output by different convolution layers; it relieves the contradiction between translation invariance and translation variability and achieves a good compromise between detection precision and speed, that is, it raises the detection speed while keeping the detection precision high. For a 300 × 300 input, tested on the VOC2007 dataset in a Titan X GPU hardware environment, its detection speed is 59 fps with a mean average precision (mAP) of 74.3%. SSD is an end-to-end training model whose total loss function L used in training includes the confidence loss of the classification regression and the location loss of the bounding-box regression, defined as:
L(x, c, l, g) = (1/P) [ L_conf(x, c) + α L_loc(x, l, g) ]        (2)

In equation (2), x represents the input; c represents the classification confidence; l represents the predicted offsets, including the translation offsets of the center-point coordinates and the scaling offsets of the bounding-box width and height; g is the ground-truth box giving the actual position of the target; L_conf(x, c) is the confidence loss of the classification regression; L_loc(x, l, g) is the location loss of the bounding-box regression; α is a parameter balancing the two losses; and P is the number of matched default boxes, the total loss being set to 0 when P is 0.
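A hedged sketch of a loss of this form using standard PyTorch losses as stand-ins (the patent does not specify the exact loss terms; a full SSD loss additionally restricts the location loss to matched default boxes and uses hard negative mining, which are omitted here; all tensor names are assumptions for illustration):

    import torch
    import torch.nn.functional as F

    def ssd_total_loss(conf_pred, cls_target, loc_pred, loc_target, num_matched, alpha=1.0):
        # conf_pred: (boxes, num_classes) class scores; cls_target: (boxes,) class indices
        # loc_pred, loc_target: (boxes, 4) offsets of center x/y and width/height
        if num_matched == 0:
            return torch.zeros((), device=conf_pred.device)                # P = 0 -> total loss is 0
        l_conf = F.cross_entropy(conf_pred, cls_target, reduction="sum")   # confidence loss
        l_loc = F.smooth_l1_loss(loc_pred, loc_target, reduction="sum")    # location loss
        return (l_conf + alpha * l_loc) / num_matched                      # cf. eq. (2)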
Step 2, reading the image sequence and detecting each image with the MS detection network. Fig. 4 shows the MS network convolution feature pyramid. To satisfy the translation variability required by the detection task, the invention takes two feature maps from the improved MobileNet and four feature maps from the additional standard convolutional layers to form a feature-map pyramid, convolves them with different 3x3 convolution kernels, and uses the convolved results as the final features for classification regression and bounding-box regression. The invention takes a 300 × 300 picture as input; the numbers of default boxes per feature cell in the six-level convolution feature-map pyramid are 4, 6 and 6, respectively; and the 3x3, stride-1 convolution kernels used for different layers and different tasks all have different parameters.
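A minimal sketch of attaching 3x3 prediction heads to such a feature-map pyramid, one classification head and one bounding-box regression head per level; the channel widths and default-box counts below are illustrative placeholders, not the values of Table 1:

    import torch.nn as nn

    NUM_CLASSES = 2          # two-class face detection: face target and background

    def make_heads(channels, boxes_per_cell):
        cls_heads, loc_heads = nn.ModuleList(), nn.ModuleList()
        for ch, k in zip(channels, boxes_per_cell):
            cls_heads.append(nn.Conv2d(ch, k * NUM_CLASSES, 3, padding=1))  # classification regression
            loc_heads.append(nn.Conv2d(ch, k * 4, 3, padding=1))            # bounding-box regression
        return cls_heads, loc_heads

    # six pyramid levels: illustrative channel counts and default boxes per feature cell
    cls_heads, loc_heads = make_heads([512, 1024, 512, 256, 256, 256], [4, 6, 6, 6, 4, 4])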
Step 3, updating the tracking model: the coordinate information of the detected face target is passed to the KCF tracker and used as the tracker's base sample box, and samples are drawn and trained near this box to predict the position of the face target in the next frame.
Large angle changes, severe occlusion and other problems of a face moving in an image sequence can cause missed detections during face detection. KCF is a fast target tracking algorithm, so the model is updated during the face detection process: once the MS detection network detects a face, the KCF algorithm is started for continuous, stable tracking, and after 10 tracked frames the face detection model is used again to update the target position so as to avoid tracking loss (a minimal sketch of this update loop is given after the two points below). The KCF algorithm therefore:
(1) enhances the robustness of face detection in an image sequence to changes in pose, angle and the like;
(2) acts as the link and accelerator in the DTD model, greatly improving the detection speed of the whole system.
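The following is a minimal sketch of this detection-tracking-detection update loop, assuming a hypothetical detector function ms_detect_face(frame) that stands in for the MS detection network and returns a face box (x, y, w, h) or None, and using OpenCV's KCF tracker (opencv-contrib-python) as the tracking component:

    import cv2

    REDETECT_EVERY = 10                      # re-run the detector after this many tracked frames

    def run_dtd(video_path, ms_detect_face):
        cap = cv2.VideoCapture(video_path)
        tracker, tracked_frames, box = None, 0, None
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if tracker is None or tracked_frames >= REDETECT_EVERY:
                box = ms_detect_face(frame)                  # detection step (accurate, slower)
                tracker, tracked_frames = None, 0
                if box is not None:
                    tracker = cv2.TrackerKCF_create()        # re-seed the KCF tracker
                    tracker.init(frame, tuple(int(v) for v in box))
            else:
                found, box = tracker.update(frame)           # tracking step (fast)
                tracked_frames += 1
                if not found:
                    tracker, box = None, None                # lost target: fall back to detection
            if box is not None:
                x, y, w, h = (int(v) for v in box)
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.imshow("MS-KCF", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
        cap.release()
        cv2.destroyAllWindows()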
Let x_i be the inputs and y_i the corresponding labels, so that the training sample set is {(x_i, y_i)} with R samples in total. The purpose of regression is to find a mapping f such that f(x_i) ≈ y_i. The linear regression function is f(x) = w^T x, where w represents the weight coefficients. Equation (3) is the error function used by the algorithm:

min_w Σ_i ( f(x_i) - y_i )^2 + λ ||w||^2        (3)

where the coefficient λ controls the structural complexity of the system so as to guarantee the generalization ability of the classifier. Solving equation (3) by the least-squares method gives the optimal weight coefficients w:

w = (X^T X + λI)^(-1) X^T y        (4)

In equation (4), T denotes the transpose, I is the identity matrix, and each row of X is a sample feature vector. Equation (5) is the complex-field form of equation (4):

w = (X^H X + λI)^(-1) X^H y        (5)

where X^H denotes the complex conjugate transpose of X. At this point, the computational time complexity of solving w is O(n^3).
In the KCF algorithm, the training samples and test samples are all generated from a base sample x through a circulant matrix:

X = C(x)        (6)

The circulant matrix X in equation (6) can be diagonalized by the discrete Fourier matrix F of equation (7):

F_jk = (1/√n) e^(-2πi·jk/n)        (7)

X = F diag(x̂) F^H        (8)

X^H X = F diag(x̂*) diag(x̂) F^H        (9)

X^H X = F diag(x̂* ⊙ x̂) F^H        (10)

In equation (8), x̂ is the discrete Fourier transform of the base sample x, and F^H denotes the complex conjugate transpose of F. In equation (9), x̂* is the Hermitian (complex conjugate) of x̂, and "diag" is the matrix diagonalization operation. Equation (10) is a rewriting of equation (9), where "⊙" is the element-by-element multiplication operation. Applying the discrete Fourier transform to both sides of equation (5) and using equations (8)-(10) gives:

ŵ = (x̂* ⊙ ŷ) / (x̂* ⊙ x̂ + λ)        (11)

In equation (11), ŷ is the discrete Fourier transform of y, the division is element-wise, and w is obtained from ŵ by the inverse Fourier transform. The computational time complexity of solving w through equation (11) is O(n), and that of the discrete Fourier transform is O(n log n); compared with the earlier O(n^3) complexity of solving w directly, the time complexity of the whole system is greatly reduced.
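As a minimal NumPy illustration of this speed-up (illustrative code, not the patent's implementation), the ridge-regression solution of equation (5) for a circulant data matrix can be checked against the Fourier-domain solution of equation (11); the circulant matrix below is built so that equation (11) holds with NumPy's FFT sign convention (with other shift or DFT conventions the complex conjugate moves to the other factor):

    import numpy as np

    n, lam = 16, 0.1
    rng = np.random.default_rng(0)
    x = rng.standard_normal(n)        # base sample (e.g. a flattened image patch)
    y = rng.standard_normal(n)        # regression labels

    # Circulant data matrix X = C(x) of eq. (6): column j is x cyclically shifted by j.
    X = np.stack([np.roll(x, j) for j in range(n)], axis=1)

    # Direct solution of eq. (5): w = (X^H X + lam*I)^(-1) X^H y, costing O(n^3)
    w_direct = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

    # Fourier-domain solution of eq. (11): only FFTs and element-wise operations, O(n log n)
    x_hat, y_hat = np.fft.fft(x), np.fft.fft(y)
    w_hat = np.conj(x_hat) * y_hat / (np.conj(x_hat) * x_hat + lam)
    w_fourier = np.real(np.fft.ifft(w_hat))

    print(np.allclose(w_direct, w_fourier))   # True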
The aim of the KCF algorithm is to reduce the computational time complexity of the regression through the circulant matrix in Fourier space, thereby achieving a substantial speed-up.
Step 4, in order to prevent loss of the face target during tracking, after tracking for 10 frames the MS detection model is updated and the face target is detected and localized again, so that the whole network forms a new automatic Detection-Tracking-Detection (DTD) mode, namely the MS-KCF face detection model, and the whole detection process balances speed and precision.
Step 5, comparing the experimental results with current state-of-the-art face detection methods and analyzing them.
The tests were evaluated on a GTX1080 GPU, with the input pictures all scaled to a size of 300 x 300.
Table 2 compares the average detection rate and average speed of different methods on the standard FDDB static face detection dataset. It shows that the proposed method achieves a better detection rate, and that the detection speed of the MS detection network is 2.8 times faster than MTCNN and 9.3 times faster than Faceness. The method therefore combines a high detection rate with a fast detection speed on this static face detection database.
TABLE 2 Average detection rate and average speed of different methods on the FDDB dataset
Fig. 5(a) and 5(b) show the test results on two image sequences (Girl and FaceOcc1) from the VOT2016 dynamic face tracking dataset. Girl is an image sequence with large face angle changes, and FaceOcc1 is an image sequence with heavy occlusion. In fig. 5(a) and 5(b), the first two rows of each image sequence are the detection results of the MS model and the last two rows are the detection results of the MS-KCF model. Clearly, the MS-KCF model has better detection performance for faces with larger angle changes and more severe occlusion in an image sequence.
Fig. 6 and 7 are ROC curve comparisons on the Girl and FaceOcc1 image sequences of the VOT2016 dataset, respectively. As can be seen from fig. 6 and 7, for the face detection task in an image sequence, the detection performance of the MS-KCF method with the model update function is superior to that of the MS method with only the detection function.
TABLE 3 Average speed of different methods on the VOT2016 dataset
Table 3 compares the average speeds of the different methods on the VOT2016 dataset. As can be seen from Table 3, the MS-KCF method with the model update function is fast: its detection speed is 2.3 times faster than the MS method with only the detection function, 6.4 times faster than MTCNN, and 21.4 times faster than Faceness.

Claims (4)

1. A method for rapidly and stably detecting human faces in an image sequence based on MS-KCF comprises the following five steps:
step 1, building the MS (MobileNet-SSD) detection network, whose structure includes four parts: the first part is the input layer, used to input pictures; the second part is the improved MobileNet convolution network, used to extract features from the input picture; the third part is the SSD meta-structure, used for classification regression and bounding-box regression; the fourth part is the output layer, used to output the detection result; in the improved MobileNet convolution structure, Conv_Dw_Pw is a depthwise separable convolution (Depthwise Separable Convolutions), Dw is the 3x3 depthwise convolution layer (Depthwise Layers), Pw is the 1x1 pointwise convolution layer (Pointwise Layers), and every convolution operation is followed by a Batch Normalization (BN) algorithm and the ReLU6 activation function;
step 2, reading an image sequence, and detecting the image by using an MS network;
step 3, updating the tracking model: the coordinate information of the detected face target is passed to the Kernelized Correlation Filter (KCF) tracker and used as the tracker's base sample box, and samples are drawn and trained near this box to predict the position of the face target in the next frame;
step 4, in order to prevent the loss of the human face target during tracking, after tracking for a plurality of frames, updating the MS detection network, and re-detecting and positioning the human face target;
and 5, comparing and analyzing the experimental result with the current advanced face detection method.
2. The method as claimed in claim 1, wherein the MS detection network in step 1 replaces the VGG base network of the original SSD model with the improved, fast and accurate MobileNet network; since the Pw structure in the original MobileNet changes the distribution of the data output by the Dw structure and thus reduces the detection accuracy, the fully connected layer of the original MobileNet is removed and 8 standard convolutional layers are additionally added to enlarge the receptive field of the feature maps, adjust the data distribution and enhance the translation invariance required by the classification task; to prevent the gradient from vanishing, a Batch Normalization (BN) layer is added after each convolutional layer; and the activation function is changed from ReLU to ReLU6.
3. The method for rapidly and stably detecting the human face in the image sequence based on the MS-KCF as claimed in claim 1, wherein, in order to satisfy the translation variability required by the detection task, the MS detection network proposed in step 2 takes two feature maps from the improved MobileNet and four feature maps from the additional standard convolutional layers to form a feature-map pyramid, convolves them with different 3x3 convolution kernels, and uses the convolved results as the final features for classification regression and bounding-box regression.
4. The method for rapidly and stably detecting the human face in the image sequence based on the MS-KCF as claimed in claim 1, wherein the two model updates in step 3 and step 4 realize accurate detection and localization of the face target, so that the whole network forms a new automatic Detection-Tracking-Detection (DTD) cyclic update mode, namely the MS-KCF face detection model, which not only keeps face detection stable under large angle changes and severe occlusion in an image sequence but also greatly improves the detection speed.
CN201810124952.2A 2018-02-07 2018-02-07 Method for rapidly and stably detecting human face in image sequence based on MS-KCF Active CN108229442B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810124952.2A CN108229442B (en) 2018-02-07 2018-02-07 Method for rapidly and stably detecting human face in image sequence based on MS-KCF


Publications (2)

Publication Number Publication Date
CN108229442A CN108229442A (en) 2018-06-29
CN108229442B true CN108229442B (en) 2022-03-11

Family

ID=62671130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810124952.2A Active CN108229442B (en) 2018-02-07 2018-02-07 Method for rapidly and stably detecting human face in image sequence based on MS-KCF

Country Status (1)

Country Link
CN (1) CN108229442B (en)

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109146967A (en) * 2018-07-09 2019-01-04 上海斐讯数据通信技术有限公司 The localization method and device of target object in image
CN109086678B (en) * 2018-07-09 2022-02-25 天津大学 Pedestrian detection method for extracting image multilevel features based on deep supervised learning
CN109118519A (en) * 2018-07-26 2019-01-01 北京纵目安驰智能科技有限公司 Target Re-ID method, system, terminal and the storage medium of Case-based Reasoning segmentation
CN109271848B (en) * 2018-08-01 2022-04-15 深圳市天阿智能科技有限责任公司 Face detection method, face detection device and storage medium
CN109063666A (en) * 2018-08-14 2018-12-21 电子科技大学 The lightweight face identification method and system of convolution are separated based on depth
CN109034119A (en) * 2018-08-27 2018-12-18 苏州广目信息技术有限公司 A kind of method for detecting human face of the full convolutional neural networks based on optimization
CN109145836B (en) * 2018-08-28 2021-04-16 武汉大学 Ship target video detection method based on deep learning network and Kalman filtering
CN109344731B (en) * 2018-09-10 2022-05-03 电子科技大学 Lightweight face recognition method based on neural network
CN109409210B (en) * 2018-09-11 2020-11-24 苏州飞搜科技有限公司 Face detection method and system based on SSD (solid State disk) framework
CN109558877B (en) * 2018-10-19 2023-03-07 复旦大学 KCF-based offshore target tracking algorithm
CN109492674B (en) * 2018-10-19 2020-11-03 北京京东尚科信息技术有限公司 Generation method and device of SSD (solid State disk) framework for target detection
CN111104817A (en) * 2018-10-25 2020-05-05 中车株洲电力机车研究所有限公司 Fatigue detection method based on deep learning
CN109523476B (en) * 2018-11-02 2022-04-05 武汉烽火众智数字技术有限责任公司 License plate motion blur removing method for video detection
CN109583443B (en) * 2018-11-15 2022-10-18 四川长虹电器股份有限公司 Video content judgment method based on character recognition
CN109472315B (en) * 2018-11-15 2021-09-24 江苏木盟智能科技有限公司 Target detection method and system based on depth separable convolution
CN109598742A (en) * 2018-11-27 2019-04-09 湖北经济学院 A kind of method for tracking target and system based on SSD algorithm
CN109711332B (en) * 2018-12-26 2021-03-26 浙江捷尚视觉科技股份有限公司 Regression algorithm-based face tracking method and application
CN109993052B (en) * 2018-12-26 2021-04-13 上海航天控制技术研究所 Scale-adaptive target tracking method and system under complex scene
CN109754071B (en) * 2018-12-29 2020-05-05 中科寒武纪科技股份有限公司 Activation operation method and device, electronic equipment and readable storage medium
CN109840502B (en) * 2019-01-31 2021-06-15 深兰科技(上海)有限公司 Method and device for target detection based on SSD model
CN111582007A (en) 2019-02-19 2020-08-25 富士通株式会社 Object identification method, device and network
CN109903507A (en) * 2019-03-04 2019-06-18 上海海事大学 A kind of fire disaster intelligent monitor system and method based on deep learning
CN109828251B (en) * 2019-03-07 2022-07-12 中国人民解放军海军航空大学 Radar target identification method based on characteristic pyramid light-weight convolution neural network
CN109978045A (en) * 2019-03-20 2019-07-05 深圳市道通智能航空技术有限公司 A kind of method for tracking target, device and unmanned plane
CN110009015A (en) * 2019-03-25 2019-07-12 西北工业大学 EO-1 hyperion small sample classification method based on lightweight network and semi-supervised clustering
CN110298225A (en) * 2019-03-28 2019-10-01 电子科技大学 A method of blocking the human face five-sense-organ positioning under environment
CN111860046B (en) * 2019-04-26 2022-10-11 四川大学 Facial expression recognition method for improving MobileNet model
CN110287849B (en) * 2019-06-20 2022-01-07 北京工业大学 Lightweight depth network image target detection method suitable for raspberry pi
CN110378239A (en) * 2019-06-25 2019-10-25 江苏大学 A kind of real-time traffic marker detection method based on deep learning
CN110414371A (en) * 2019-07-08 2019-11-05 西南科技大学 A kind of real-time face expression recognition method based on multiple dimensioned nuclear convolution neural network
CN110490899A (en) * 2019-07-11 2019-11-22 东南大学 A kind of real-time detection method of the deformable construction machinery of combining target tracking
CN110580445B (en) * 2019-07-12 2023-02-07 西北工业大学 Face key point detection method based on GIoU and weighted NMS improvement
CN110363137A (en) * 2019-07-12 2019-10-22 创新奇智(广州)科技有限公司 Face datection Optimized model, method, system and its electronic equipment
CN110348423A (en) * 2019-07-19 2019-10-18 西安电子科技大学 A kind of real-time face detection method based on deep learning
CN110647810A (en) * 2019-08-16 2020-01-03 西北大学 Method and device for constructing and identifying radio signal image identification model
CN110619279B (en) * 2019-08-22 2023-03-17 天津大学 Road traffic sign instance segmentation method based on tracking
CN110443247A (en) * 2019-08-22 2019-11-12 中国科学院国家空间科学中心 A kind of unmanned aerial vehicle moving small target real-time detecting system and method
CN110495962A (en) * 2019-08-26 2019-11-26 赫比(上海)家用电器产品有限公司 The method and its toothbrush and equipment of monitoring toothbrush position
CN112487852A (en) * 2019-09-12 2021-03-12 上海齐感电子信息科技有限公司 Face detection method and device for embedded equipment, storage medium and terminal
CN110956082B (en) * 2019-10-17 2023-03-24 江苏科技大学 Face key point detection method and detection system based on deep learning
CN110909688B (en) * 2019-11-26 2020-07-28 南京甄视智能科技有限公司 Face detection small model optimization training method, face detection method and computer system
CN111160269A (en) * 2019-12-30 2020-05-15 广东工业大学 Face key point detection method and device
CN111339832B (en) * 2020-02-03 2023-09-12 中国人民解放军国防科技大学 Face synthetic image detection method and device
CN111339858B (en) * 2020-02-17 2022-07-29 电子科技大学 Oil and gas pipeline marker identification method based on neural network
CN111325157A (en) * 2020-02-24 2020-06-23 高新兴科技集团股份有限公司 Face snapshot method, computer storage medium and electronic device
CN111667505B (en) * 2020-04-30 2023-04-07 北京捷通华声科技股份有限公司 Method and device for tracking fixed object
CN111814827B (en) * 2020-06-08 2024-06-11 湖南腓腓动漫有限责任公司 YOLO-based key point target detection method
CN112581506A (en) * 2020-12-31 2021-03-30 北京澎思科技有限公司 Face tracking method, system and computer readable storage medium
CN112801117B (en) * 2021-02-03 2022-07-12 四川中烟工业有限责任公司 Multi-channel receptive field guided characteristic pyramid small target detection network and detection method


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017088050A1 (en) * 2015-11-26 2017-06-01 Sportlogiq Inc. Systems and methods for object tracking and localization in videos with adaptive image representation
CN106127776A (en) * 2016-06-28 2016-11-16 北京工业大学 Based on multiple features space-time context robot target identification and motion decision method
CN106204638A (en) * 2016-06-29 2016-12-07 西安电子科技大学 A kind of based on dimension self-adaption with the method for tracking target of taking photo by plane blocking process
CN107066953A (en) * 2017-03-22 2017-08-18 北京邮电大学 It is a kind of towards the vehicle cab recognition of monitor video, tracking and antidote and device
CN106960446A (en) * 2017-04-01 2017-07-18 广东华中科技大学工业技术研究院 A kind of waterborne target detecting and tracking integral method applied towards unmanned boat

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
High-speed tracking with kernelized correlation filters; J. F. Henriques; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2014-08-01; Vol. 37, No. 3; 583-596 *
Fast and stable face detection in image sequences based on the MS-KCF model (基于MS-KCF模型的图像序列中人脸快速稳定检测); Ye Yuanzheng et al.; Journal of Computer Applications (计算机应用); 2018-08-10; Vol. 38, No. 8; 2192-2204 *
Research on kernelized correlation filter tracking algorithms based on depth information (基于深度信息的核相关性滤波跟踪算法研究); Wang Yang; China Master's Theses Full-text Database, Information Science and Technology; 2018-01-15; I138-1602 *

Also Published As

Publication number Publication date
CN108229442A (en) 2018-06-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant