CN110570450B - Target tracking method based on cascade context-aware framework


Info

Publication number
CN110570450B
CN110570450B (application CN201910882861.XA)
Authority
CN
China
Prior art keywords
context
target
map
aware
framework
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910882861.XA
Other languages
Chinese (zh)
Other versions
CN110570450A (en)
Inventor
邬向前
卜巍
马丁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN201910882861.XA priority Critical patent/CN110570450B/en
Publication of CN110570450A publication Critical patent/CN110570450A/en
Application granted granted Critical
Publication of CN110570450B publication Critical patent/CN110570450B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/207 Analysis of motion for motion estimation over a hierarchy of resolutions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses a target tracking method based on a cascaded context-aware framework built from two networks: an image-based context-aware network (ICANet) and an image block-based context-aware network (PCANet). The framework progressively models the varied changes between targets and their context information. The first network focuses on the most discriminative information between the target and its context and on the coarse structure of the target, while the second network focuses on the fine structural information of the target itself. From the output of the two networks, namely the final context-aware map (FCA map), a positioning box of the target can be flexibly generated, and the target can be effectively distinguished from surrounding background information such as interferents. The FCA map obtained by the invention can be flexibly embedded into various tracking frameworks.

Description

Target tracking method based on cascade context-aware framework
Technical Field
The invention relates to a target tracking method, in particular to a target tracking method based on a cascaded context-aware framework.
Background
Based on the powerful representation capabilities of convolutional neural networks (CNNs), researchers have proposed many CNN-based trackers. Among them, most trackers use a rectangular box to mark the position of the target; in this case, the target model inevitably contains some context information, and ignoring context information has a significant impact on tracking performance. First, learning the target model from a limited spatial region may result in overfitting and is not robust to rapid changes in the appearance of the target. Second, the lack of true negative examples greatly impairs the robustness of the tracker against complex backgrounds; in particular, when similar visual information exists in the target and its context, the risk of tracking drift greatly increases. Third, when the context information is not fully considered, it is difficult for the tracker to effectively handle occlusion of the target.
Most existing target tracking algorithms consider only the context information within a local range around the target and pay little attention to the context information of the whole input image. As a result, interferents and background information present across the whole image are ignored, which degrades the robustness of the tracking algorithm.
Disclosure of Invention
In order to reduce background interference to the tracker, attend to the context information in every corner of the whole image, and solve the problems of existing target tracking algorithms, the invention provides a target tracking method based on a cascaded context-aware framework.
The purpose of the invention is realized by the following technical scheme:
a target tracking method based on a cascade context-aware framework comprises the following steps:
step one, constructing a cascade-based context-aware framework (CAT) comprising two sub-networks: an image-based context-aware network (ICANet) and an image block-based context-aware network (PCANet), wherein: the input of ICANet is the whole image, used to capture background information over the entire input image, while PCANet is used to distinguish similar interferents within the local range of the target;
step two, learning an image-level context perception map (ICA map) through ICANet, and capturing the most discriminative features between the target and the surrounding context and the approximate structure information of the target;
step three, learning an image block-level context perception map (PCA map) through PCANet, and acquiring the target's own structural information and suppressing interferent information based on the PCA map;
step four, after obtaining the ICA map and the PCA map, mapping the pixels of the PCA map to the ICA map so as to obtain a final context perception map (FCA map);
step five, based on the final context-aware map (FCA map), obtaining a positioning box of the target using one of two strategies (a high-level sketch of the full pipeline follows), wherein:
strategy one, applying a sigmoid to each pixel of the FCA map, obtaining a binary mask by binarizing the FCA map (threshold 0.5), and generating a bounding box as the axis-aligned bounding rectangle of the binary mask;
and strategy two, embedding the FCA map into a Bayesian framework, i.e., calculating the maximum a posteriori estimate according to the probability that each candidate sample belongs to the target.
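As an illustration, the five steps above can be summarized in a minimal Python sketch. This is only a schematic reading of the pipeline, not the patented implementation: ICANet and PCANet are assumed to be callables returning single-channel score maps the size of their input, the frame is assumed grayscale, and `patch_size`, the cropping logic, and the pixel projection are simplified assumptions.

```python
import numpy as np

def cat_track_frame(frame, icanet, pcanet, patch_size=100):
    # Step two: image-level context-aware map over the whole frame.
    ica_map = icanet(frame)                       # H x W score map

    # Step three: crop an image block centered on the highest ICA response
    # and run the image block-level network on it.
    cy, cx = np.unravel_index(np.argmax(ica_map), ica_map.shape)
    half = patch_size // 2
    y0, x0 = max(cy - half, 0), max(cx - half, 0)
    pca_map = pcanet(frame[y0:y0 + patch_size, x0:x0 + patch_size])

    # Step four: project the PCA-map pixels back into the ICA map (FCA map).
    fca_map = ica_map.copy()
    h, w = pca_map.shape
    fca_map[y0:y0 + h, x0:x0 + w] = pca_map

    # Step five (strategy one): sigmoid, binarize at 0.5, axis-aligned box.
    mask = 1.0 / (1.0 + np.exp(-fca_map)) > 0.5
    ys, xs = np.nonzero(mask)
    return (xs.min(), ys.min(), xs.max(), ys.max()) if ys.size else None
```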
Compared with the prior art, the invention has the following advantages:
1. The invention provides a context-aware framework based on the cascade of two networks, ICANet and PCANet. The framework progressively models the varied changes between targets and their context information. The first network focuses on the most discriminative information between the target and its context and on the coarse structure of the target, while the second network focuses on the fine structural information of the target itself. From the output of the two networks, namely the final context-aware map, a positioning box of the target can be flexibly generated, and the target can be effectively distinguished from surrounding background information such as interferents.
2. The FCA map obtained by the invention can be flexibly embedded into various tracking frameworks.
Drawings
FIG. 1 is a general flow diagram of the CAT framework proposed by the present invention;
FIG. 2 is an architecture of ICANet;
FIG. 3 is an architecture of a PCANet;
FIG. 4 shows visualization results: (a) the label, (b) the visualization result of the FCA map without L_Boundary, and (c) the visualization result of the FCA map with L_Boundary added;
FIG. 5 shows precision and success rate plots on the OTB100 dataset: (a) precision, (b) success rate;
FIG. 6 shows precision and success rate plots on the TC128 dataset: (a) precision, (b) success rate;
fig. 7 is a visualization of the CAT tracker proposed by the present invention in a challenging sequence.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings, but not limited thereto, and any modification or equivalent replacement of the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention shall be covered by the protection scope of the present invention.
The invention provides a target tracking method based on a cascade context-aware framework, which comprises the following steps:
1. image level context aware network (ICANet)
The present invention recognizes that a recurrent structure is very important for generating a context map of an object, because it helps the network know the position of the object across consecutive frames. As shown in FIG. 1, the recurrent structure generates an image-level context-aware map (ICA map) in a recurrent fashion. The entire network consists of one feature extractor (the five convolutional layers conv1-conv5 of VGG-M) and five additional modules, each consisting of a convolutional layer, an average pooling layer, a convolutional LSTM unit, and a deconvolution layer.
For ICANet, separating the target from the background is treated as a binary classification problem. In most cases, there is contrasting information between the target and its context. To capture this contrast, the present invention proposes to subtract the local mean of the features from the features themselves; the mean is computed by an average pooling layer with a kernel size of 3 × 3.
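A minimal PyTorch sketch of this contrast operation follows; the function name is illustrative, and the stride/padding values are assumptions chosen so the feature size is unchanged:

```python
import torch.nn.functional as F

def contrast_layer(x):
    # Subtract the local mean (3x3 average pooling) from the features,
    # highlighting where the target differs from its surrounding context.
    local_mean = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
    return x - local_mean
```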
In most cases, the context changes relatively slowly compared with the appearance of the target. Therefore, the present invention selects the LSTM to model this long-term dependence. As shown in FIG. 2, the convolutional LSTM cell (pink rectangle) is formed by an input gate I_t, a forget gate F_t, a cell state C_t, and an output gate O_t. In the time dimension, the relationship between the gates and the states can be expressed as:

I_t = σ(W_xi * X_t + W_hi * H_{t−1} + b_i)
F_t = σ(W_xf * X_t + W_hf * H_{t−1} + b_f)
C_t = F_t ⊙ C_{t−1} + I_t ⊙ tanh(W_xc * X_t + W_hc * H_{t−1} + b_c)
O_t = σ(W_xo * X_t + W_ho * H_{t−1} + b_o)
H_t = O_t ⊙ tanh(C_t)   (1)

where X_t is the feature generated by the contrast layer; the cell state C_t is input to the next LSTM step; the hidden output is denoted by H_t; * is the convolution operation; ⊙ is the element-wise (dot) product; σ is the sigmoid function; tanh is the hyperbolic tangent; W_* are the parameters to be learned; and b_* are bias terms. The output of the LSTM is concatenated with the corresponding feature map and sent to the deconvolution layer. After the five additional modules, feature maps of different sizes are upsampled to the input size. Finally, a convolutional layer with kernel size 1 × 1 is appended after the last deconvolution to produce a single-channel score map. For the loss function, the invention treats the output as a likelihood probability; since the distribution of target/background pixels is imbalanced, a class-balanced cross-entropy loss function is employed for training:

L_CE = −β Σ_{k=1}^{K} Q_k log P_k − (1−β) Σ_{k=1}^{K} (1−Q_k) log(1−P_k)   (2)

where K is the total number of training pixels, Q_k is the Gaussian-shaped label, P_k is the prediction probability, and β is the class-balancing weight.
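The convolutional LSTM of Eq. (1) can be sketched in PyTorch as follows. Fusing the four gates into a single convolution is an implementation convenience, and the channel counts and kernel size are assumptions:

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        # One convolution computes all four gates (I, F, O, and the
        # candidate cell input) at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x_t, state):
        h_prev, c_prev = state
        z = self.gates(torch.cat([x_t, h_prev], dim=1))
        i, f, o, g = torch.chunk(z, 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c_t = f * c_prev + i * torch.tanh(g)   # cell state C_t
        h_t = o * torch.tanh(c_t)              # hidden output H_t
        return h_t, c_t
```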
2. Image block level context-aware network (PCANet)
The structure of ICANet is based on 2D CNNs and convolutional LSTMs, which usually focus on capturing coarser, long-term temporal dependence. However, such an architecture may lack the ability to represent finer structural information within local spatiotemporal windows. Furthermore, the output of ICANet is a Gaussian-shaped map, which in some cases cannot describe the exact contour of the target.
Figure 3 shows the network structure of PCANet. The present invention crops an image block from the current frame; the center of the image block is located at the highest-response area of the ICA map. PCANet consists of a feature extractor (the first three convolutional layers of ICANet) and three additional modules. Each additional module consists of a convolutional layer for reducing the feature size, an RNN unit for modeling the target's own structure, and a deconvolution layer for progressively restoring the feature to the input size.
PCANet aims to obtain the structure of the target itself. However, the resolution of the target features is low, and the target occupies only a small portion of the image. In order to capture the complete structure of the target, a high-resolution feature map needs to be constructed. The present invention meets this need by expanding the receptive field of each activation. To this end, the max-pooling layers after conv1 and conv2 in the VGG-M network are deleted. After this operation, the output feature map of conv3 is four times larger than that in the original VGG-M network. This enables the extraction of high-resolution features and improves the quality of the constructed structure.
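A sketch of the resulting high-resolution feature extractor is given below; the channel counts follow VGG-M (96, 256, 512), while the exact kernel/stride settings are simplified assumptions:

```python
import torch.nn as nn

# conv1-conv3 of a VGG-M-style extractor with the max-pooling layers after
# conv1 and conv2 removed, so conv3 keeps 4x the original spatial resolution.
high_res_extractor = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=7, stride=2), nn.ReLU(inplace=True),
    # (max-pooling removed here)
    nn.Conv2d(96, 256, kernel_size=5, stride=2), nn.ReLU(inplace=True),
    # (max-pooling removed here)
    nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1), nn.ReLU(inplace=True),
)
```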
The structure of the target itself is then built upon the RNN units. In each RNN unit, several directed RNNs are used to model the target's own structure, i.e., the topology of an undirected graph is approximated by a combination of directed graphs. In the RNN unit, the undirected graph is decomposed into four directed graphs: right (G_1), left (G_2), up (G_3), and down (G_4). By executing the RNNs, the hidden states h^n (n = 1, ..., 4) are calculated, and the sum of the outputs of all hidden layers is fed to the output layer. This process can be expressed as:

h^n_{v_i} = tanh(U_n x_{v_i} + W_n Σ_{v_j ∈ P(v_i)} h^n_{v_j} + b_n)
o_{v_i} = Σ_{n=1}^{4} V_n h^n_{v_i} + c   (3)

where U_n, W_n, and V_n are the matrix parameters corresponding to G_n, b_n and c are bias terms, and P(v_i) denotes the set of predecessors of vertex v_i in G_n. The output of the RNN unit is then input to a deconvolution layer to expand the feature map. Finally, the output is a single-channel score map of the same size as the input.
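For illustration, one of the four directed RNNs of Eq. (3), the left-to-right sweep G_1, can be sketched as follows; the other three directions are obtained by flipping the feature map, and implementing the projections as 1 × 1 convolutions is an assumption of this sketch:

```python
import torch
import torch.nn as nn

class DirectedRNN(nn.Module):
    """One directed RNN (G_1, left-to-right); each pixel's predecessor is
    its left neighbour, so the sweep runs column by column."""
    def __init__(self, ch, hid):
        super().__init__()
        self.U = nn.Conv2d(ch, hid, 1)               # input projection U_n (bias = b_n)
        self.W = nn.Conv2d(hid, hid, 1, bias=False)  # recurrent projection W_n

    def forward(self, x):
        proj = self.U(x)                              # (B, hid, H, W)
        h = torch.zeros_like(proj[..., :1])           # initial hidden column
        cols = []
        for j in range(proj.shape[-1]):               # sweep left to right
            h = torch.tanh(proj[..., j:j + 1] + self.W(h))
            cols.append(h)
        return torch.cat(cols, dim=-1)                # hidden states h^1
```

The full RNN unit would run four such sweeps and combine their hidden states through the output projections V_n and bias c, as in Eq. (3).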
In order to emphasize the boundary of the object, the invention proposes a boundary loss L_Boundary as an auxiliary loss; the total loss includes the class-balanced cross-entropy loss and the boundary loss L_Boundary. To calculate L_Boundary, the boundaries of the prediction and the ground truth first need to be extracted. Here, the Sobel filter detects boundaries as a 3 × 3 convolution. Mathematically, the Sobel kernels can be expressed as:

S_x = | −1  0  +1 |      S_y = | −1 −2 −1 |
      | −2  0  +2 |            |  0  0  0 |
      | −1  0  +1 |            | +1 +2 +1 |   (4)

which encode the horizontal and vertical gradients, respectively. The full Sobel filter is constructed by stacking the two kernels. L_Boundary is calculated as the mean squared error between the boundaries extracted from the label q_k and those extracted from the prediction p_k. The overall loss function for training the proposed PCANet is then:

L = L_CE + L_Boundary   (5)
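A sketch of L_Boundary follows, implementing the Sobel kernels of Eq. (4) as 3 × 3 convolutions. Taking the MSE over gradient magnitudes (rather than over the raw filter responses) is an assumption of this sketch:

```python
import torch
import torch.nn.functional as F

SOBEL_X = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)  # vertical-gradient kernel

def boundary_loss(pred, label):
    # pred/label: (B, 1, H, W) score map and Gaussian-shaped label.
    def edges(m):
        gx = F.conv2d(m, SOBEL_X, padding=1)
        gy = F.conv2d(m, SOBEL_Y, padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)
    return F.mse_loss(edges(pred), edges(label))
```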
FIG. 4 shows a visualization of the PCA map. As can be seen from FIG. 4, PCANet focuses more on the finer structure of the target.
3. Determination of target position
To estimate the position of the target, the final context-aware map (FCA map) is constructed by projecting the two results, i.e., the result of PCANet is mapped onto the result of ICANet by pixel-value mapping. Then, two different strategies are considered to generate the rectangular box of the target:
(1) Given the FCA map, a sigmoid is applied to each pixel of the FCA map. Then, a binary mask is obtained by binarizing the FCA map (threshold 0.5). From this binary mask, a bounding box (denoted as Seg) is generated as the axis-aligned bounding rectangle.
(2) The FCA map is embedded in a Bayesian framework, i.e., the maximum a posteriori estimate is calculated based on the likelihood that the candidate sample belongs to the target. In order to obtain a clearer and more accurate target description, the detailed information of the target (denoted as ICA) is described using independent component analysis (ICA).
ICA is a method for extracting a desired signal from among the source signals under the guidance of a reference. To obtain a reference, the input frame is first convolved with a Laplacian-of-Gaussian filter to produce a boundary map. The reference m_r is then obtained by element-wise multiplication of the boundary map with the FCA map. Given m_r as the reference and m_s as the signal, the desired signal is obtained by the projection s = w^T m_s. The goal is to maximize the negentropy J(s):
J(s) ≈ ρ [ E{G(s)} − E{G(ν)} ]^2   (6)
subject to ε(s, m_r) ≤ δ   (7)

where G(·) is a non-quadratic function, ρ is a constant, ν is a Gaussian variable, ε(·) is a normalization (closeness) function, δ is a threshold, and E[·] denotes expectation. The results of the ICA are then input to the appearance model in a Bayesian framework. In this framework, the position of the target is denoted l_t = (x, y, σ), where x, y, and σ denote the center-point coordinates and the scale of the rectangular box, respectively. All candidate samples are normalized to a standard size.
To this end, the confidence of the r-th candidate sample is determined by summing all pixel values in its heat map:

c_r = Σ_k h_r(k)   (8)

where h_r(k) is the k-th pixel value of the heat map of the r-th candidate sample.
the final position is calculated by:
l_t* = argmax_r c_r   (9)

where l_t* is the optimal appearance state of the target in the current frame.
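A minimal sketch of this candidate-scoring loop is given below. It omits the normalization of candidates to a standard size, and the box size `base` and the per-parameter sigmas are illustrative assumptions:

```python
import numpy as np

def bayesian_locate(fca_map, prev_state, n_candidates=600,
                    sigmas=(10.0, 10.0, 0.01)):
    # prev_state = (x, y, scale, base); boxes are base*scale pixels square.
    x0, y0, s0, base = prev_state
    rng = np.random.default_rng()
    best, best_conf = prev_state, -np.inf
    for _ in range(n_candidates):
        x = x0 + rng.normal(0.0, sigmas[0])
        y = y0 + rng.normal(0.0, sigmas[1])
        s = s0 + rng.normal(0.0, sigmas[2])
        w = h = max(int(base * s), 1)
        x1, y1 = int(max(x - w / 2, 0)), int(max(y - h / 2, 0))
        conf = fca_map[y1:y1 + h, x1:x1 + w].sum()   # Eq. (8)
        if conf > best_conf:
            best_conf, best = conf, (x, y, s, base)
    return best                                      # argmax of Eq. (9)
```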
4. Online update
Online update strategies play an important role in the tracking process. For ICANet, the input is the entire image. Since ICANet is trained on sequences with a maximum length of 16 frames, the LSTM state is reset after every 16 frames; the state of the LSTM is then set to the output of the first forward pass, which encodes the information of the tracked target. For PCANet, the network is updated frame by frame using the estimated binary mask.
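The 16-frame reset rule can be sketched as follows, assuming a hypothetical API icanet(frame, state) -> (ica_map, state):

```python
def track_sequence(frames, icanet):
    first_state, state = None, None
    for t, frame in enumerate(frames):
        if t > 0 and t % 16 == 0:
            # Reset: re-seed the LSTM with the first-pass state, which
            # encodes the information of the tracked target.
            state = first_state
        ica_map, state = icanet(frame, state)
        if t == 0:
            first_state = state
        yield ica_map
```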
5. Details of training
For ICANet, the weights are initialized using VGG-M, and the other parameters are initialized randomly from a normal distribution. The Adam optimizer is used with a learning rate of 10^-4. ICANet is trained in two stages. In the first stage, ICANet is trained for 300 epochs with a batch size of 16 frames on the CDnet2014 dataset. Then, ICANet is fine-tuned for 200 epochs on the DAVIS2016 dataset with a batch size of 16 frames. For PCANet, the feature extractor is initialized using the first three convolutional layers of ICANet, and training uses a learning rate of 10^-5 for about 300 iterations. All parameters are fixed throughout the experiments.
6. Results and analysis of the experiments
To evaluate the performance of CAT, the present invention uses standard evaluation metrics. On the OTB100 dataset, the algorithm is evaluated using the commonly used one-pass evaluation (OPE), with precision and success rate plots as indices. For the precision metric, the estimated position must lie within a certain threshold distance of the ground-truth position; typically, the threshold distance is set to 20 pixels. The success rate measures the overlap rate between the predicted bounding box and the ground-truth bounding box. Precision and success rate plots are also used for the TC128 dataset. On the VOT2016 dataset, each tracker is evaluated by the metrics of accuracy ranking (A), robustness ranking (R), and expected average overlap (EAO).
1. Implementation details
The tracking method proposed by the present invention is implemented using the MatConvNet toolkit and runs on a PC with an Intel(R) Core(TM) i7-4790K CPU and an NVIDIA Tesla K40c GPU. The input sizes of ICANet and PCANet are 300 × 300 and 100 × 100, respectively. The LSTM layer has 1024 cells. All new layers are initialized using the MSRA initialization method. The label for ICANet is generated using a two-dimensional Gaussian function with a peak of 1.0. In PCANet, the state of the target in the first frame is initialized by GrabCut. The dimensions of the hidden layers of the RNN are set to 512, 256, and 128. For the Bayesian-framework tracking strategy, 600 candidates are generated for each frame using a Gaussian distribution model. The variances of the candidate position parameters are set to {10, 0.01}, respectively.
2. Self-comparative experiment
To verify the effectiveness of the various components of CAT, six CAT variants were designed and evaluated on OTB100. The gray lines represent variants of ICANet using the Seg strategy; the white lines show variants of ICANet using the ICA strategy. The precision and success scores are listed in the last two columns of Table 1. For the Seg strategy, "ICANet + PCANet" and "ICANet + PCANet + L_Boundary" improve the precision measure by 4% and 4.4%, respectively. For the ICA strategy, "ICANet + PCANet" and "ICANet + PCANet + L_Boundary" improve the precision measure by 5.2% and 5.8%, respectively. The results show that the proposed framework can improve performance when context and boundary information are fully considered. The invention selects the "ICANet + PCANet + ICA + L_Boundary" architecture for comparison with other state-of-the-art trackers on the following three public datasets.
TABLE 1: precision and success scores of the six CAT variants on OTB100 [table not reproduced]
3. Experimental results on OTB100 data set
The CAT tracker proposed by the present invention is compared with 16 recently published trackers on the OTB100 dataset: SiamRPN++, DSLT, DAT, DaSiamRPN, MCPF, TADT, ACT, MetaCREST, PTAV, CREST, TRACA, CNN-SVM, BACF, ACFN, CFNet, and UDT. Tracking performance is measured by one-pass evaluation (OPE) based on two metrics: center position error and overlap ratio; the results are shown in FIG. 5. According to FIG. 5, the CAT tracker exhibits competitive performance on this dataset, with precision and success rate values on OTB100 of 0.909 and 0.697, respectively.
4. Experimental results on the TC128 data set
The invention is evaluated on the TC128 dataset, which contains 128 videos; the results compared with 12 state-of-the-art trackers are shown in FIG. 6. Among all compared methods, the method of the present invention improves the precision score from 0.8073 (the best prior tracker) to 0.8153. FIG. 6(b) shows the success rate over all 128 videos in the TC128 dataset. The CAT tracker outperforms the state-of-the-art methods with an AUC score of 0.6138. This result verifies the robustness of the proposed CAT.
5. Experimental results on VOT2016 dataset
The present invention evaluates the performance of CAT on the VOT2016 dataset. The VOT2016 report shows that, under the EAO metric, the state-of-the-art bound is set to 0.251, and trackers with EAO values exceeding this bound are defined as state-of-the-art. The CAT tracker is compared with 7 state-of-the-art trackers, including ECO, C-COT, Staple, MDNet, CREST, SiamFC, and ECO-hc. As shown in Table 2, the CAT tracker obtains a high ranking among all compared trackers.
TABLE 2
Tracker  CAT    ECO    C-COT  Staple  MDNet  CREST  SiamFC  ECO-hc
EAO      0.332  0.367  0.331  0.295   0.257  0.283  0.235   0.322
A        0.57   0.55   0.54   0.54    0.54   0.51   0.53    0.54
R        0.23   0.20   0.24   0.38    0.34   0.25   0.46    0.30
6. Analysis and discussion
The qualitative results of the proposed CAT tracker on a subset of challenging sequences are shown in FIG. 7. The proposed CAT is able to successfully cope with both in-plane and out-of-plane rotations of the target. The ICA map generated by ICANet captures the most discriminative features for separating foreground and background, i.e., it retains the most robust features over a long time span. The recurrent unit in PCANet is more robust to occlusion of the target. In addition, the proposed PCANet can effectively capture structural changes of the target. Compared with BACF and DaSiamRPN, the proposed CAT obtains better tracking results under illumination variation and complex backgrounds, because context information from every corner of the entire image is extracted. Meanwhile, the FCA map captures both coarse-grained and fine-grained information of the target, so the proposed tracker performs better than BACF on sequences with small target sizes.

Claims (7)

1. A cascade-based context-aware framework target tracking method is characterized by comprising the following steps:
step one, constructing a cascade-based context-aware framework CAT comprising two sub-networks: an image-based context-aware network ICANet and an image block-based context-aware network PCANet, wherein: the input of ICANet is the whole image, used to capture background information over the entire input image, while PCANet is used to distinguish similar interferents within the local range of the target;
the ICANet is comprised of a feature extractor and five additional modules, wherein: the feature extractor comprises five convolutional layers in the VGG-M, and each additional module consists of a convolutional layer, an average pooling layer, a convolution LSTM unit and a deconvolution layer;
the PCANet is comprised of a feature extractor and three additional modules, wherein: the feature extractor comprises the first three convolutional layers in ICANet, each additional module consists of a convolutional layer for reducing the feature size, an RNN unit for modeling the structure of itself, and an anti-convolutional layer for progressively increasing the feature to the input size;
step two, learning an image-level context perception map ICA map through ICANet, and capturing the most discriminative features between a target and the surrounding context and the approximate structural information of the target;
step three, learning an image block-level context perception map PCA map through PCANet, and acquiring the target's own structural information and suppressing interferent information based on the PCA map;
step four, after obtaining the ICA map and the PCA map, mapping the pixels of the PCA map to the ICA map to obtain a final context perception map FCA map;
and step five, obtaining a positioning frame of the target based on the final FCA map.
2. The cascade-based context-aware framework target tracking method according to claim 1, wherein the kernel size of the average pooling layer is 3 x 3.
3. The cascade-based context-aware framework target tracking method according to claim 1, wherein the convolutional LSTM unit is composed of an input gate I_t, a forget gate F_t, a cell state C_t, and an output gate O_t, and the relationship between the gates and the states in the time dimension is expressed as:

I_t = σ(W_xi * X_t + W_hi * H_{t−1} + b_i)
F_t = σ(W_xf * X_t + W_hf * H_{t−1} + b_f)
C_t = F_t ⊙ C_{t−1} + I_t ⊙ tanh(W_xc * X_t + W_hc * H_{t−1} + b_c)
O_t = σ(W_xo * X_t + W_ho * H_{t−1} + b_o)
H_t = O_t ⊙ tanh(C_t)   (1)

where X_t is the feature generated by the contrast layer, C_t is the cell state, H_t denotes the hidden output, * is the convolution operation, ⊙ is the element-wise (dot) product, σ is the sigmoid function, W_* are the parameters to be learned, b_* are bias terms, and tanh is the hyperbolic tangent operation.
4. The cascade-based context-aware framework target tracking method according to claim 1, wherein in step one, ICANet employs a class-balanced cross-entropy loss function for training:

L_CE = −β Σ_{k=1}^{K} Q_k log P_k − (1−β) Σ_{k=1}^{K} (1−Q_k) log(1−P_k)   (2)

where K is the total number of training pixels, Q_k is the Gaussian-shaped label, P_k is the prediction probability, and β is the class-balancing weight.
5. The cascade-based context-aware framework target tracking method according to claim 1, wherein the overall loss function for training the PCANet is calculated by the following formula:

L = L_CE + L_Boundary   (5)

where L_Boundary is the boundary loss and L_CE is the class-balanced cross-entropy loss of equation (2), with q_k the label, K the total number of training pixels, and p_k the prediction probability.
6. The cascade-based context-aware framework target tracking method according to claim 1, wherein in step five, the method for obtaining the positioning box of the target based on the final FCA map is as follows: a sigmoid is applied to each pixel of the FCA map, a binary mask is then obtained by binarizing the FCA map, and a bounding box is generated from the binary mask as its axis-aligned bounding rectangle.
7. The cascade-based context-aware framework target tracking method according to claim 1, wherein in step five, the method for obtaining the positioning box of the target based on the final FCA map is as follows: the FCA map is embedded into a Bayesian framework, i.e., the maximum a posteriori estimate is calculated according to the probability that the candidate sample belongs to the target.
CN201910882861.XA 2019-09-18 2019-09-18 Target tracking method based on cascade context-aware framework Active CN110570450B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910882861.XA CN110570450B (en) 2019-09-18 2019-09-18 Target tracking method based on cascade context-aware framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910882861.XA CN110570450B (en) 2019-09-18 2019-09-18 Target tracking method based on cascade context-aware framework

Publications (2)

Publication Number Publication Date
CN110570450A CN110570450A (en) 2019-12-13
CN110570450B true CN110570450B (en) 2023-03-24

Family

ID=68780920

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910882861.XA Active CN110570450B (en) 2019-09-18 2019-09-18 Target tracking method based on cascade context-aware framework

Country Status (1)

Country Link
CN (1) CN110570450B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222470A (en) * 2020-01-09 2020-06-02 北京航空航天大学 Visible light remote sensing image ship detection method based on multivariate Gaussian distribution and PCANet
CN113761976A (en) * 2020-06-04 2021-12-07 华为技术有限公司 Scene semantic analysis method based on global guide selective context network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447473A (en) * 2015-12-14 2016-03-30 江苏大学 PCANet-CNN-based arbitrary attitude facial expression recognition method
CN107622225A (en) * 2017-07-27 2018-01-23 成都信息工程大学 Face identification method based on independent component analysis network
KR20180069220A (en) * 2016-12-15 2018-06-25 현대자동차주식회사 Algorithm for discrimination of target by using informations from radar
CN110070562A (en) * 2019-04-02 2019-07-30 西北工业大学 A kind of context-sensitive depth targets tracking

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105447473A (en) * 2015-12-14 2016-03-30 江苏大学 PCANet-CNN-based arbitrary attitude facial expression recognition method
KR20180069220A (en) * 2016-12-15 2018-06-25 현대자동차주식회사 Algorithm for discrimination of target by using informations from radar
CN107622225A (en) * 2017-07-27 2018-01-23 成都信息工程大学 Face identification method based on independent component analysis network
CN110070562A (en) * 2019-04-02 2019-07-30 西北工业大学 A kind of context-sensitive depth targets tracking

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An image edge detection method robust to illumination and shadow; Zhang Jinxia; Journal of Lanzhou University of Technology; June 2007; Vol. 33, No. 3; pp. 100-103 *
Dual-space feature extraction algorithm based on PCA and ICA; Wang Weidong et al.; Journal of Image and Graphics; November 2008; Vol. 13, No. 11; pp. 2163-2169 *

Also Published As

Publication number Publication date
CN110570450A (en) 2019-12-13

Similar Documents

Publication Publication Date Title
CN108416266B (en) Method for rapidly identifying video behaviors by extracting moving object through optical flow
CN108550161B (en) Scale self-adaptive kernel-dependent filtering rapid target tracking method
Han et al. Visual object tracking via sample-based Adaptive Sparse Representation (AdaSR)
US8989442B2 (en) Robust feature fusion for multi-view object tracking
CN111027493B (en) Pedestrian detection method based on deep learning multi-network soft fusion
Jia et al. Visual tracking via adaptive structural local sparse appearance model
CN107633226B (en) Human body motion tracking feature processing method
CN108038435B (en) Feature extraction and target tracking method based on convolutional neural network
CN110738207A (en) character detection method for fusing character area edge information in character image
CN107516316B (en) Method for segmenting static human body image by introducing focusing mechanism into FCN
KR101409810B1 (en) Real-time object tracking method in moving camera by using particle filter
CN114758288A (en) Power distribution network engineering safety control detection method and device
Zhang et al. A swarm intelligence based searching strategy for articulated 3D human body tracking
CN113657560A (en) Weak supervision image semantic segmentation method and system based on node classification
CN110570450B (en) Target tracking method based on cascade context-aware framework
CN105405138A (en) Water surface target tracking method based on saliency detection
Bellavia et al. HarrisZ+: Harris corner selection for next-gen image matching pipelines
CN111738164A (en) Pedestrian detection method based on deep learning
Yu et al. Automatic segmentation of golden pomfret based on fusion of multi-head self-attention and channel-attention mechanism
CN113657225B (en) Target detection method
Juang et al. Moving object recognition by a shape-based neural fuzzy network
Hu et al. Multi-task l0 gradient minimization for visual tracking
CN111986233B (en) Large-scene minimum target remote sensing video tracking method based on feature self-learning
Li et al. Research on hybrid information recognition algorithm and quality of golf swing
Mei et al. Fast template matching in multi-modal image under pixel distribution mapping

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant