CN107491761A - Target tracking method based on deep learning features and point-to-set distance metric learning - Google Patents

Target tracking method based on deep learning features and point-to-set distance metric learning

Info

Publication number
CN107491761A
Authority
CN
China
Prior art keywords
sample
target
background
feature
extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710730930.6A
Other languages
Chinese (zh)
Other versions
CN107491761B (en)
Inventor
张盛平
刘鑫丽
齐元凯
张维刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology Weihai
Priority to CN201710730930.6A
Publication of CN107491761A
Application granted
Publication of CN107491761B
Expired - Fee Related (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking method based on deep learning features and point-to-set distance metric learning, comprising the following steps: randomly selecting a number of target samples and background samples in the initial frame of tracking; extracting target sample features from the target samples and background sample features from the background samples; clustering the extracted target sample features into several target template sets and the extracted background sample features into several background template sets; learning a projection matrix by reducing the distance between samples of the same class and increasing the distance between samples of different classes; collecting target candidates in each subsequent frame according to a Gaussian distribution; extracting the features of the target candidates and projecting the target template sets, the background template sets, and the target candidates into a common subspace using the projection matrix; and computing the distance from each target candidate to all target template sets, taking the sum of these distances as the score of that candidate, the final tracking result being the average of the several target candidates with the smallest distances.

Description

Target tracking method based on deep learning features and point-to-set distance metric learning
Technical field
The present invention relates to the technical field of image processing and pattern recognition, and in particular to a target tracking method based on deep learning features and point-to-set distance metric learning.
Background art
Target tracking is an important research direction in computer vision, with wide application in video surveillance, virtual reality, human-computer interaction, autonomous driving, and other fields. At present, discriminative tracking methods achieve good tracking results. Most discriminative tracking methods treat target tracking as a classification task: target samples and background samples are selected in the first frame to train an SVM classifier; in each subsequent frame, a number of target candidates are collected and each is classified by the classifier as target or background; the candidate with the highest target confidence is taken as the tracking result. However, an SVM classifier relies only on a small number of support vectors (a few samples selected from the training samples to define the classification boundary), and in most cases the samples are not linearly separable, so the contribution of the remaining samples to classification is ignored.
Summary of the invention
The object of the present invention is to provide a target tracking method based on deep learning features and point-to-set distance metric learning, which extracts features with a deep convolutional neural network to improve discriminative power at the representation level, and uses point-to-set distance metric learning so that all training samples contribute to classification.
To achieve the above object, the present invention adopts the following technical solution:
A target tracking method based on deep learning features and point-to-set distance metric learning, comprising the following steps:
randomly selecting a number of target samples and background samples in the initial frame of tracking;
extracting target sample features from the target samples and background sample features from the background samples;
clustering the extracted target sample features into several target template sets and the extracted background sample features into several background template sets;
learning a projection matrix by reducing the distance between samples of the same class and increasing the distance between samples of different classes;
collecting target candidates in each subsequent frame according to a Gaussian distribution;
extracting the features of the target candidates, and projecting the target template sets, the background template sets, and the target candidates into a common subspace using the projection matrix;
computing the distance from each target candidate to all target template sets and taking the sum of these distances as the score of that candidate, the final tracking result being the average of the several target candidates with the smallest distances.
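The scoring in the last step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not spell out how the distance from a point to a template set is computed, so the sketch assumes the Euclidean distance to the nearest member of the set, and it assumes all features have already been projected into the common subspace. The names `point_to_set_distance`, `score_candidates`, `track`, and `top_k` are hypothetical.

```python
import math


def point_to_set_distance(point, template_set):
    """Assumed stand-in: Euclidean distance to the nearest member of the set."""
    return min(math.dist(point, member) for member in template_set)


def score_candidates(candidates, target_sets):
    """Score of a candidate = sum of its distances to all target template sets."""
    return [
        sum(point_to_set_distance(feat, s) for s in target_sets)
        for feat in candidates
    ]


def track(boxes, candidates, target_sets, top_k=2):
    """Average the (x, y, w, h) boxes of the top_k candidates with the smallest scores."""
    scores = score_candidates(candidates, target_sets)
    best = sorted(range(len(scores)), key=scores.__getitem__)[:top_k]
    return [sum(boxes[i][d] for i in best) / len(best) for d in range(4)]
```

A smaller score means the candidate lies closer to the target templates, which is why the candidates are sorted in ascending order before averaging.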
Further, randomly selecting a number of target samples and background samples in the initial frame of tracking includes:
sampling in the initial frame with a target-to-background sample ratio of 1:10, wherein each target sample has an intersection-over-union with the specified tracking region greater than 0.7, and each background sample has an intersection-over-union with the specified tracking region less than 0.5.
Further, extracting target sample features from the target samples and background sample features from the background samples includes:
using the deep convolutional neural network MDNet to extract target sample features from the target samples and background sample features from the background samples.
Further, clustering the extracted target sample features into several target template sets and the extracted background sample features into several background template sets includes:
clustering the extracted target sample features into several target sample sets by K-means and assigning a class label to each target sample set; clustering the extracted background sample features into several background sample sets by K-means and assigning a class label to each background sample set.
Further, learning the projection matrix by reducing the distance between samples of the same class and increasing the distance between samples of different classes includes:
defining the objective function to be optimized, whose optimization goals comprise three items: after projection, the distance between a sample and a sample set of the same class is small, and the distance between a sample and a sample set of a different class is large; after projection, the distance between samples of the same class is small, and the distance between samples of different classes is large; and every dimension after projection has the same importance;
solving the projection matrix for Euclidean space and the projection matrix for the manifold space by alternating iterative optimization.
The effects described in this summary are only those of the embodiments, not all effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
The present invention provides a target tracking method based on deep learning features and point-to-set distance metric learning, which remedies the insufficient discriminative power of traditional hand-crafted features and overcomes the under-use of training samples during classification in traditional SVM-based discriminative tracking methods. Through point-to-set distance metric learning, the invention can effectively compute the distance from each target candidate to all target samples, so that every target sample contributes to the classification of a candidate, yielding better classification results.
Brief description of the drawings
Fig. 1 is a flowchart of the target tracking method of the present invention based on deep learning features and point-to-set distance metric learning.
Detailed description of the embodiments
As shown in Fig. 1, a target tracking method based on deep learning features and point-to-set distance metric learning comprises the following steps:
S1: randomly selecting a number of target samples and background samples in the initial frame of tracking;
S2: extracting target sample features from the target samples and background sample features from the background samples;
S3: clustering the extracted target sample features into several target template sets and the extracted background sample features into several background template sets;
S4: learning a projection matrix by reducing the distance between samples of the same class and increasing the distance between samples of different classes;
S5: collecting target candidates in each subsequent frame according to a Gaussian distribution whose mean is the target position in the previous frame and whose variance is 1;
S6: extracting the features of the target candidates, and projecting the target template sets, the background template sets, and the target candidates into a common subspace using the projection matrix;
S7: computing the distance from each target candidate to all target template sets and taking the sum of these distances as the score of that candidate, the final tracking result being the average of the several target candidates with the smallest distances.
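Candidate collection in step S5 can be illustrated with a short sketch. It assumes that only the box centre is perturbed and that the box size is kept from the previous frame; the text does not say whether scale is also sampled. The function name `sample_candidates`, the default of 256 candidates, and the `seed` parameter are hypothetical choices for illustration.

```python
import random


def sample_candidates(prev_box, num_candidates=256, variance=1.0, seed=None):
    """Draw candidate boxes whose centres follow N(previous centre, variance).

    prev_box is (x, y, w, h) of the target in the previous frame.
    The box size is kept unchanged (an assumption of this sketch).
    """
    rng = random.Random(seed)
    x, y, w, h = prev_box
    cx, cy = x + w / 2.0, y + h / 2.0
    std = variance ** 0.5
    boxes = []
    for _ in range(num_candidates):
        nx = rng.gauss(cx, std)  # horizontal centre of the candidate
        ny = rng.gauss(cy, std)  # vertical centre of the candidate
        boxes.append((nx - w / 2.0, ny - h / 2.0, w, h))
    return boxes
```

Each returned box would then be cropped from the frame, passed through the feature extractor, and scored as in step S7.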
In step S1, sampling in the initial frame follows a target-to-background sample ratio of 1:10, with at least 100 positive samples. Here, 500 target samples and 5000 background samples are randomly collected. Each target sample has an intersection-over-union (IoU) with the specified tracking target region greater than 0.7, and each background sample has an IoU with the specified tracking region less than 0.5. The IoU of two image regions is the number of pixels in their intersection divided by the number of pixels in their union.
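The intersection-over-union test used to label samples can be written out as a small sketch. It assumes axis-aligned boxes given as (x, y, w, h), for which the area ratio stands in for the pixel-count ratio defined above; the function names `iou` and `label_sample` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def label_sample(box, target_box, pos_thresh=0.7, neg_thresh=0.5):
    """Return +1 for a target sample, -1 for a background sample, None if discarded."""
    overlap = iou(box, target_box)
    if overlap > pos_thresh:
        return 1
    if overlap < neg_thresh:
        return -1
    return None
```

Boxes whose overlap falls between the two thresholds are used as neither target nor background samples, which keeps ambiguous patches out of both classes.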
In step S2, the deep convolutional neural network MDNet is used to extract target sample features from the target samples and background sample features from the background samples: each sample is resized to 107x107, 128 is subtracted from the pixel values of each channel, and the result is fed to MDNet; the output of MDNet's third convolutional layer is taken as the feature of the sample.
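The input preparation described for step S2 can be sketched in plain Python. The interpolation method is not stated in the text, so nearest-neighbour resizing is an assumption here; the MDNet forward pass itself is not shown, and the function name `preprocess` is hypothetical.

```python
def preprocess(patch, size=107, offset=128):
    """Resize an H x W x C patch (nested lists) to size x size and subtract offset.

    Nearest-neighbour resizing is an assumption of this sketch; the text
    only gives the 107x107 target size and the subtraction of 128.
    """
    height, width = len(patch), len(patch[0])
    out = []
    for r in range(size):
        src_r = min(height - 1, r * height // size)  # nearest source row
        row = []
        for c in range(size):
            src_c = min(width - 1, c * width // size)  # nearest source column
            row.append([value - offset for value in patch[src_r][src_c]])
        out.append(row)
    return out
```

The result is a 107x107 grid of per-channel values in roughly the range -128 to 127, ready to be handed to the network.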
In step S3, K-means clustering groups the extracted target sample features into 7 target sample sets; 5 or more target sample sets are preferred here in order to fully capture the diversity of the target's appearance. Each target sample set is assigned a class label, for example +1 to +7. K-means clustering likewise groups the extracted background sample features into 20 background sample sets; 10 or more background sample sets are preferred in order to fully capture the diversity of the background. Each background sample set is assigned a class label, for example -1 to -20.
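The clustering and labelling in step S3 might look like the sketch below. It uses a plain Lloyd's K-means with random initialisation, which is one common variant and not necessarily the one used by the inventors; the iteration count, the seed, and the names `kmeans` and `build_template_sets` are hypothetical.

```python
import math
import random


def kmeans(features, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per feature vector."""
    rng = random.Random(seed)
    centres = [list(f) for f in rng.sample(features, k)]
    assign = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: attach every feature to its nearest centre.
        assign = [
            min(range(k), key=lambda j: math.dist(f, centres[j]))
            for f in features
        ]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [f for f, a in zip(features, assign) if a == j]
            if members:  # keep the old centre if a cluster becomes empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def build_template_sets(features, k, sign):
    """Group features into k sets labelled sign*1 .. sign*k (e.g. +1..+7 or -1..-20)."""
    assign = kmeans(features, k)
    sets = {sign * (j + 1): [] for j in range(k)}
    for f, a in zip(features, assign):
        sets[sign * (a + 1)].append(f)
    return sets
```

With the numbers given in the text, one would call `build_template_sets(target_features, 7, +1)` and `build_template_sets(background_features, 20, -1)`.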
In step S4, learning the projection matrix by reducing the distance between samples of the same class and increasing the distance between samples of different classes includes:
Define the objective function to be optimized [formula not reproduced in the source text]. The first term [formula not reproduced] preserves the distance between a point $x_i$ in Euclidean space and a point $S_j$ on the manifold: after projection, the distance between a single sample $x_i$ and a sample set $S_j$ of the same class is small, while the distance between a single sample $x_i$ and a sample set $S_j$ of a different class is large. Here $x_i$ denotes any target sample or background sample, $S_j$ denotes any target sample set or background sample set, and $f(\cdot)$ and [symbol not reproduced] denote the mappings to be learned. If $x_i$ and $S_j$ belong to the same class (both target or both background), $1(i,j) = 1$; otherwise $1(i,j) = -1$.
The second term $(G_e + G_r)$ preserves distances between sample points in Euclidean space and between sample points on the manifold: after projection, single samples of the same class are close together and single samples of different classes are far apart; likewise, sample sets of the same class are close together and sample sets of different classes are far apart. Here $d(v_i, v_j) = \exp(\|v_i - v_j\|^2 / \sigma^2)$, where $v$ stands for $x$ or $S$.
The third term [formula not reproduced] is a regularization constraint, so that every dimension after projection has the same importance.
The projection matrix for Euclidean space and the projection matrix for the manifold space are solved by alternating iterative optimization. Specifically, let [formula not reproduced] and
$K_x(x_i, x_j) = \langle f_x(x_i), f_x(x_j) \rangle$, where $W_x$ and $W_s$ are the projection matrices to be solved. The updates follow [formula not reproduced] and [formula not reproduced], where [formula not reproduced], $L_x = B_x - Q_x$, and $L_s = B_s - Q_s$. Let $v$ stand for $x$ or $S$. If samples $i$ and $j$ belong to the same class and are $k_1$-nearest neighbors (here $k_1 = 1$; other values may be used), then $Q_v(i,j) = d(v_i, v_j)$; if samples $i$ and $j$ belong to different classes and are $k_2$-nearest neighbors (here $k_2 = 5$; other values may be used), then $Q_v(i,j) = -d(v_i, v_j)$; otherwise $Q_v(i,j) = 0$. The final values of $W_x$ and $W_s$ are obtained by iterating the update several times (for example, 10 times).
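The construction of the matrix Q_v described above can be sketched as follows, under several labelled assumptions. The kernel d(v_i, v_j) is taken as exp(||v_i - v_j||^2 / sigma^2) with a hypothetical width `sigma`; "k-nearest neighbor" is read as "j is among the k nearest samples to i within the relevant class group"; and B is assumed to be the diagonal matrix of the row sums of Q, which the text does not state. The names `affinity_matrix` and `laplacian` are hypothetical, and the update of W_x and W_s is not shown because its formulas are not available in the text.

```python
import math


def affinity_matrix(samples, labels, k1=1, k2=5, sigma=1.0):
    """Build Q: +d for the k1 nearest same-class samples, -d for the k2 nearest
    different-class samples, 0 elsewhere. sigma is an assumed kernel width."""
    n = len(samples)

    def d(i, j):
        return math.exp(math.dist(samples[i], samples[j]) ** 2 / sigma ** 2)

    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # All other samples, ordered from nearest to farthest.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(samples[i], samples[j]),
        )
        same = [j for j in others if labels[j] == labels[i]][:k1]
        diff = [j for j in others if labels[j] != labels[i]][:k2]
        for j in same:
            q[i][j] = d(i, j)
        for j in diff:
            q[i][j] = -d(i, j)
    return q


def laplacian(q):
    """L = B - Q, assuming B is the diagonal matrix of the row sums of Q."""
    n = len(q)
    return [
        [(sum(q[i]) if i == j else 0.0) - q[i][j] for j in range(n)]
        for i in range(n)
    ]
```

The same routine would be applied once to the single samples (giving Q_x and L_x) and once to the sample sets (giving Q_s and L_s), with a set-appropriate distance in place of the Euclidean one.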
Although specific embodiments of the present invention have been described above with reference to the accompanying drawing, they do not limit the scope of protection of the invention. Those of ordinary skill in the art should understand that modifications or variations that can be made on the basis of the technical solution of the present invention without creative effort remain within its scope of protection.

Claims (5)

1. A target tracking method based on deep learning features and point-to-set distance metric learning, characterized by comprising the following steps:
randomly selecting a number of target samples and background samples in the initial frame of tracking;
extracting target sample features from the target samples and background sample features from the background samples;
clustering the extracted target sample features into several target template sets and the extracted background sample features into several background template sets;
learning a projection matrix by reducing the distance between samples of the same class and increasing the distance between samples of different classes;
collecting target candidates in each subsequent frame according to a Gaussian distribution;
extracting the features of the target candidates, and projecting the target template sets, the background template sets, and the target candidates into a common subspace using the projection matrix;
computing the distance from each target candidate to all target template sets and taking the sum of these distances as the score of that candidate, the final tracking result being the average of the several target candidates with the smallest distances.
2. The target tracking method based on deep learning features and point-to-set distance metric learning according to claim 1, characterized in that randomly selecting a number of target samples and background samples in the initial frame of tracking includes:
sampling in the initial frame with a target-to-background sample ratio of 1:10, wherein each target sample has an intersection-over-union with the specified tracking region greater than 0.7, and each background sample has an intersection-over-union with the specified tracking region less than 0.5.
3. The target tracking method based on deep learning features and point-to-set distance metric learning according to claim 1, characterized in that extracting target sample features from the target samples and background sample features from the background samples includes:
using the deep convolutional neural network MDNet to extract target sample features from the target samples and background sample features from the background samples.
4. The target tracking method based on deep learning features and point-to-set distance metric learning according to claim 1, characterized in that clustering the extracted target sample features into several target template sets and the extracted background sample features into several background template sets includes:
clustering the extracted target sample features into several target sample sets by K-means and assigning a class label to each target sample set; clustering the extracted background sample features into several background sample sets by K-means and assigning a class label to each background sample set.
5. The target tracking method based on deep learning features and point-to-set distance metric learning according to claim 1, characterized in that learning the projection matrix by reducing the distance between samples of the same class and increasing the distance between samples of different classes includes:
defining the objective function to be optimized, whose optimization goals comprise three items: after projection, the distance between a sample and a sample set of the same class is small, and the distance between a sample and a sample set of a different class is large; after projection, the distance between samples of the same class is small, and the distance between samples of different classes is large; and every dimension after projection has the same importance;
solving the projection matrix for Euclidean space and the projection matrix for the manifold space by alternating iterative optimization.
CN201710730930.6A 2017-08-23 2017-08-23 Target tracking method based on deep learning characteristics and point-to-set distance metric learning Expired - Fee Related CN107491761B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710730930.6A CN107491761B (en) 2017-08-23 2017-08-23 Target tracking method based on deep learning characteristics and point-to-set distance metric learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710730930.6A CN107491761B (en) 2017-08-23 2017-08-23 Target tracking method based on deep learning characteristics and point-to-set distance metric learning

Publications (2)

Publication Number Publication Date
CN107491761A true CN107491761A (en) 2017-12-19
CN107491761B CN107491761B (en) 2020-04-03

Family

ID=60650820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710730930.6A Expired - Fee Related CN107491761B (en) 2017-08-23 2017-08-23 Target tracking method based on deep learning characteristics and point-to-set distance metric learning

Country Status (1)

Country Link
CN (1) CN107491761B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711354A (en) * 2018-12-28 2019-05-03 哈尔滨工业大学(威海) A kind of method for tracking target indicating study based on video attribute

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101777184A (en) * 2009-11-11 2010-07-14 中国科学院自动化研究所 Local distance study and sequencing queue-based visual target tracking method
CN104616324A (en) * 2015-03-06 2015-05-13 厦门大学 Target tracking method based on adaptive appearance model and point-set distance metric learning
US20150279182A1 (en) * 2014-04-01 2015-10-01 Objectvideo, Inc. Complex event recognition in a sensor network
CN106097391A (en) * 2016-06-13 2016-11-09 浙江工商大学 A kind of multi-object tracking method identifying auxiliary based on deep neural network
CN105913448B (en) * 2016-05-25 2018-09-07 哈尔滨工业大学 The high spectrum image object detection method of subspace is matched based on tensor

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
YUANKAI QI ET AL: "Hedged Deep Tracking", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
YUWEI WU ET AL: "Metric Learning Based Structural Appearance Model for Robust Visual Tracking", 《IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY》 *
ZHIWU HUANG ET AL: "Learning Euclidean-to-Riemannian Metric for Point-to-Set Classification", 《2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
杨钰源: "基于度量学习和深度学习的行人重识别研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
谢英红等: "基于Grassmann流形和投影群的目标跟踪", 《仪器仪表学报》 *
钱诚: "增量型目标跟踪关键技术研究", 《中国博士学位论文全文数据库 信息科技辑》 *
陈东成: "基于机器学习的目标跟踪技术研究", 《中国博士学位论文全文数据库 信息科技辑》 *

Also Published As

Publication number Publication date
CN107491761B (en) 2020-04-03

Similar Documents

Publication Publication Date Title
CN104036255B (en) A kind of facial expression recognizing method
CN106778604B (en) Pedestrian re-identification method based on matching convolutional neural network
CN108509860A (en) HOh Xil Tibetan antelope detection method based on convolutional neural networks
CN105975968B (en) A kind of deep learning license plate character recognition method based on Caffe frame
CN104809481B (en) A kind of natural scene Method for text detection based on adaptive Color-based clustering
CN108596211B (en) Shielded pedestrian re-identification method based on centralized learning and deep network learning
CN107679078A (en) A kind of bayonet socket image vehicle method for quickly retrieving and system based on deep learning
CN103632146B (en) A kind of based on head and shoulder away from human body detecting method
CN109299688A (en) Ship Detection based on deformable fast convolution neural network
CN103413145B (en) Intra-articular irrigation method based on depth image
CN108615226A (en) A kind of image defogging method fighting network based on production
CN103440645A (en) Target tracking algorithm based on self-adaptive particle filter and sparse representation
CN105718912B (en) A kind of vehicle characteristics object detecting method based on deep learning
CN104376334B (en) A kind of pedestrian comparison method of multi-scale feature fusion
CN104598889B (en) The method and apparatus of Human bodys' response
CN107944437B (en) A kind of Face detection method based on neural network and integral image
CN107622271A (en) Handwriting text lines extracting method and system
CN110599463A (en) Tongue image detection and positioning algorithm based on lightweight cascade neural network
CN110163567A (en) Classroom roll calling system based on multitask concatenated convolutional neural network
CN110956099A (en) Dynamic gesture instruction identification method
CN107330918B (en) Football video player tracking method based on online multi-instance learning
CN101826155A (en) Method for identifying act of shooting based on Haar characteristic and dynamic time sequence matching
CN106650798A (en) Indoor scene recognition method combining deep learning and sparse representation
CN102542285B (en) Image collection scene sorting method and image collection scene sorting device based on spectrogram analysis
CN103955673B (en) Body recognizing method based on head and shoulder model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200403

Termination date: 20200823