CN103886585A - Video tracking method based on rank learning - Google Patents

Publication number: CN103886585A
Application number: CN201410054630.7A
Authority: CN (China)
Prior art keywords: target, image, sample set, training sample, video
Legal status: Pending
Inventors: 于慧敏, 曾雄
Applicant and current assignee: Zhejiang University (ZJU)
Other languages: Chinese (zh)
Priority to CN201410054630.7A
Publication of CN103886585A
Abstract

The invention discloses a video tracking method based on rank learning. The method first compresses multi-scale image features with a sparse measurement matrix based on compressed sensing theory; it then uses the Median-Flow tracking algorithm as a predictor to obtain a rough target position and to construct a training data set for the RV-SVM algorithm; finally, it ranks the training samples and uses RV-SVM as a binary classifier to separate the target from the background, achieving video tracking. Because the training process of the RV-SVM algorithm is a linear programming problem, the training time of online learning is reduced and the efficiency of the tracking system is improved. By combining multi-scale compressed feature extraction, the Median-Flow tracking algorithm and the RV-SVM algorithm, the method can effectively handle target scale change, partial occlusion, 3D rotation, pose change, fast target motion and similar problems in video tracking.

Description

Video tracking method based on rank learning
Technical field
The invention belongs to the fields of computer vision and pattern recognition, and in particular relates to a video tracking method based on rank learning.
Background art
Video tracking is a key research topic in computer vision, with wide applications in intelligent video surveillance, augmented reality, human-computer interaction, gesture recognition and autonomous driving. Although researchers at home and abroad have proposed a great many tracking algorithms over the past two decades, tracking remains a very challenging problem, because an efficient algorithm must cope with target scale change, illumination change, partial occlusion, camera rotation, object deformation and other difficulties found in real video scenes.
According to how the target appearance is modeled, tracking algorithms fall into two classes: those based on generative models and those based on discriminative models. A generative tracker first learns an appearance model of the target and then searches each frame for the region most similar to that model. A discriminative tracker treats tracking as a binary classification problem and separates the target from the background with an online-learned classifier.
At present, discriminative tracking is gradually becoming the mainstream approach in video tracking. Discriminative trackers are also known as tracking-by-detection methods, whose main steps are: 1) given the initial target position, sample positive and negative examples from the current frame and train a classifier online; 2) read the next frame and, assuming the target position changes little between consecutive frames, extract image samples around the previous target location; 3) feed the extracted samples to the trained classifier and take the highest-scoring sample as the new target position. Most tracking-by-detection algorithms can handle some of the appearance changes that occur in real scenes, but all suffer drift to varying degrees, which causes loss of the tracked target.
To handle target scale change during tracking, multi-scale image features are usually extracted. Because the dimensionality of such features is very high, compressed sensing theory can be used to reduce it. The Median-Flow tracking algorithm tracks well at low computational cost, which makes it suitable as a weak tracker that provides a rough estimate of the target position. Ranking (learning-to-rank) algorithms are widely used in machine learning, for example in text retrieval, product rating and semantic analysis; recently they have begun to be applied in computer vision. The Ranking Vector Support Vector Machine (RV-SVM) algorithm is a recent ranking algorithm.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art and to provide a video tracking method based on rank learning.
The video tracking method based on rank learning comprises the following steps:
1) representing the target O by a rectangular set of image pixels, the size of O being h × w, and generating multi-scale image features of O with multi-scale rectangle filters;
2) compressing the multi-scale image features with a sparse random measurement matrix, based on compressed sensing theory;
3) using the Median-Flow tracking algorithm as a predictor to predict the rough position of the target O in the next frame;
4) building the training sample sets X_t^1 and X_t^0 and the weakly labeled training sample set X_{t+1}^w, where X_t^1 is the set of target image patches extracted from the initial frame and several recent frames, X_t^0 is the set of background image patches extracted from recent frames, and X_{t+1}^w is the set of weakly labeled training samples extracted from the current frame;
5) using the RV-SVM algorithm as an online-learned classifier to separate the target O from the background, thereby achieving video tracking.
Step 1) is:
(1) convolve the target O with a series of multi-scale rectangle filters:

l_{i,j}(x̃, ỹ) = s(x̃, ỹ) * h_{i,j}(x̃, ỹ)

where s(x̃, ỹ) ∈ R^{w×h} represents the target O, (x̃, ỹ) are the pixel coordinates of O, R is a vector space whose dimension w × h equals the size of O, and h_{i,j} is the multi-scale rectangle filter (its defining formula appears as an image in the original document), with filter indices i = 1, …, w and j = 1, …, h;
(2) unroll each filtered image matrix l_{i,j} into a column vector l_k, k = 1, …, r, with r = w × h;
(3) concatenate the column vectors l_k into the high-dimensional multi-scale image feature h:

h = (l_1^T, …, l_r^T)^T

where h ∈ R^m, m = r², and T denotes transposition.
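As a concrete illustration, the feature construction of step 1) can be sketched as follows. The rectangle-filter definition is rendered as an image in the original document, so this sketch assumes the box filter commonly used in compressive tracking (each filter h_{i,j} sums pixels over an i × j rectangle, clipped at the patch border); the function name and boundary handling are illustrative choices, and an integral image makes each filter response O(1):

```python
import numpy as np

def multiscale_features(s):
    """Sketch of step 1): convolve a grayscale target patch s (h x w) with
    rectangle filters of every size i x j (assumed to be box sums, as in
    compressive tracking) and stack the filtered maps into one vector."""
    h, w = s.shape
    # Integral image: ii[a, b] = sum of s[:a, :b], so any rectangle sum is O(1).
    ii = np.pad(s, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    maps = []
    for i in range(1, w + 1):          # filter width index i = 1..w
        for j in range(1, h + 1):      # filter height index j = 1..h
            out = np.empty((h, w))
            for y in range(h):
                for x in range(w):
                    y2, x2 = min(y + j, h), min(x + i, w)   # clip at the border
                    out[y, x] = ii[y2, x2] - ii[y, x2] - ii[y2, x] + ii[y, x]
            maps.append(out.reshape(-1))    # unroll l_{i,j} into a column l_k
    return np.concatenate(maps)             # feature h in R^m with m = (w*h)^2

feat = multiscale_features(np.random.rand(4, 4))
print(feat.shape)
```

For a 4 × 4 patch this yields r = 16 filtered maps of 16 responses each, i.e. a feature of length m = r² = 256, matching the dimensions stated above.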
Step 2) is:
(1) generate a sparse random matrix φ ∈ R^{n×m} whose elements are

φ_{i,j} = √s × { +1 with probability p = 1/(2s); 0 with probability p = 1 − 1/s; −1 with probability p = 1/(2s) }

where s is chosen uniformly at random between 2 and 4, and p denotes probability;
(2) project the multi-scale image feature h with φ:

x = φh

obtaining the compressed multi-scale image feature x ∈ R^{n×1}.
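A minimal sketch of step 2): generating the sparse measurement matrix and projecting a high-dimensional feature. The √s scaling follows the very sparse random projection of Achlioptas and Li; s is fixed at 3 here rather than drawn at random from {2, 3, 4}, and the dimensions are illustrative:

```python
import numpy as np

def sparse_measurement_matrix(n, m, s=3, seed=0):
    """Entries are +sqrt(s) with probability 1/(2s), -sqrt(s) with
    probability 1/(2s), and 0 otherwise, so roughly a 1/s fraction of the
    matrix is non-zero while distances are nearly preserved."""
    rng = np.random.default_rng(seed)
    u = rng.random((n, m))
    phi = np.zeros((n, m))
    phi[u < 1 / (2 * s)] = np.sqrt(s)
    phi[u > 1 - 1 / (2 * s)] = -np.sqrt(s)
    return phi

h = np.random.rand(10000)               # high-dimensional multi-scale feature
phi = sparse_measurement_matrix(50, h.size)
x = phi @ h                             # compressed feature x = phi h, x in R^50
print(x.shape)
```

Because most entries of φ are zero, the projection can also be stored and applied as a sparse matrix, which is what makes the compression cheap at tracking time.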
Step 3) is:
(1) abstract the target O as a 10 × 10 grid of feature points, each feature point being represented by a 4 × 4 image patch containing it;
(2) track these feature points with the pyramidal optical-flow method, using 5 pyramid levels;
(3) compute the forward-backward (FB) error and the normalized cross-correlation (NCC) error of the feature points; the FB error is computed as

FB(T_f^k | S) = ED(T_f^k, T_b^k)

where S is a video or image sequence, ED denotes the Euclidean distance, t is the frame index of the video, T_f^k = (x_t, x_{t+1}, …, x_{t+k}) denotes feature point x_t tracked forward for k steps, and T_b^k = (x̂_t, x̂_{t+1}, …, x̂_{t+k}) denotes feature point x_{t+k} tracked backward for k steps;
(4) keep the 50% of feature points with the smaller tracking error and reject the other 50% as outliers;
(5) estimate the position of O from the median of the remaining feature points in each spatial dimension; for every pair of feature points, compute the ratio of their Euclidean distance in the current frame to their Euclidean distance in the previous frame, and take the mean of these ratios as the scale of the target.
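The per-point error and the motion/scale update of step 3) can be sketched as below. The helper names are illustrative, and the FB error is computed here as the distance between a point's starting position and the endpoint of its backward track, which is one common reading of ED(T_f^k, T_b^k):

```python
import numpy as np
from itertools import combinations

def fb_error(forward_track, backward_track):
    """FB error of one feature point: distance between where the point
    started and where the backward track brings it back to."""
    return float(np.linalg.norm(np.asarray(forward_track[0]) -
                                np.asarray(backward_track[-1])))

def median_flow_update(pts_prev, pts_curr):
    """Step 3)(5): translation = per-axis median displacement of the
    surviving points; scale = mean ratio of pairwise point distances
    between the current and the previous frame."""
    d = pts_curr - pts_prev
    dx, dy = float(np.median(d[:, 0])), float(np.median(d[:, 1]))
    ratios = [np.linalg.norm(pts_curr[a] - pts_curr[b]) /
              np.linalg.norm(pts_prev[a] - pts_prev[b])
              for a, b in combinations(range(len(pts_prev)), 2)
              if np.linalg.norm(pts_prev[a] - pts_prev[b]) > 0]
    scale = float(np.mean(ratios)) if ratios else 1.0
    return dx, dy, scale

prev = np.array([[0., 0.], [2., 0.], [0., 2.]])
curr = prev * 2 + 1                     # points translated and spread apart
print(median_flow_update(prev, curr))
```

With the surviving points translated by (1, 1) and spread to twice their mutual distances, the update recovers a unit translation and a scale of 2.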
Step 4) is:
Build the training sample set X_t^1 = {x′ : ‖l_s(x′) − l_s^*‖ ≤ α, s = 1, t − Δt, …, t};
Build the training sample set X_t^0 = {x′ : γ < ‖l_s(x′) − l_s^*‖ < β, s = t − Δt, …, t};
Build the weakly labeled training sample set X_{t+1}^w = {x′ : ‖l_s(x′) − l_s^w‖ < α, s = t + 1};
where t is the frame index of the video, Δt is the number of recent frames, l_s(x′) denotes the position of image patch x′ in frame s, l_s^* denotes the true position of the target O, l_s^w is the rough position of O obtained by the Median-Flow tracking algorithm, and α, β and γ are sampling radii.
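A sketch of the sampling rule of step 4), assuming patches are drawn uniformly and indexed by their top-left corner. The function and argument names are illustrative, and the distance bounds are taken inclusive for simplicity (the patent's background set uses a strict lower bound γ):

```python
import numpy as np

def sample_patch_locations(frame_shape, center, patch, r_min, r_max,
                           n=50, seed=0):
    """Draw n patch locations whose distance from `center` lies in
    [r_min, r_max]. With (0, alpha) this approximates the positive set
    X_t^1; with (gamma, beta) the background set X_t^0."""
    rng = np.random.default_rng(seed)
    fh, fw = frame_shape
    ph, pw = patch
    locs = []
    while len(locs) < n:
        x = int(rng.integers(0, fw - pw))
        y = int(rng.integers(0, fh - ph))
        if r_min <= np.hypot(x - center[0], y - center[1]) <= r_max:
            locs.append((x, y))
    return locs

pos = sample_patch_locations((120, 160), center=(80, 60), patch=(20, 20),
                             r_min=0, r_max=4, n=10)
neg = sample_patch_locations((120, 160), center=(80, 60), patch=(20, 20),
                             r_min=8, r_max=30, n=10)
print(len(pos), len(neg))
```

The two radii bands keep a safety gap between positive and background patches, so no training sample is ambiguous about its label.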
Step 5) is:
(1) extract the compressed multi-scale image feature x of every image patch in the training sample sets X_t^1 and X_t^0;
(2) require that features in the training sample set X_t^1 rank higher than features in X_t^0 and X_{t+1}^w, i.e. x_i^1 ≻ x_j^0 and x_i^1 ≻ x_j^w, where i and j are feature indices, i = 1, …, N_1 and j = N_1 + 1, …, N_1 + N_0; N_1 is the number of samples in the training sample set X_t^1 and N_0 is the total number of samples in X_t^0 and X_{t+1}^w;
(3) under the ordering set in step (2), use the compressed multi-scale image features x of the training sample sets to train the ranking function F_{t+1}(x), first solving the linear programming problem

min L(α, ξ) = Σ_i α_i + Σ_{ij} ξ_{ij}
s.t. Σ_i α_i (K(x_i, x_u) − K(x_i, x_v)) ≥ 1 − ξ_{uv}, α ≥ 0, ξ ≥ 0

where u = 1, …, N_1, v = N_1 + 1, …, N_1 + N_0, ξ are slack variables and K(·, ·) is a kernel function; for a linear kernel, K(x, z) = ⟨x, z⟩;
after the optimal solution α* of the linear programming problem is obtained, the ranking function F_{t+1}(x) is expressed as

F_{t+1}(x) = Σ_i α_i* K(x_i, x)

where K(·, ·) is the kernel function and i = 1, …, N_1;
(4) separate the target O from the background: the image patch that maximizes the ranking function F_{t+1}(x) is taken as the true position of O,

l_{t+1}^* = l(argmax F_{t+1}(x)), x ∈ X_{t+1}^w

where X_{t+1}^w is the weakly labeled training sample set extracted after the Median-Flow tracking algorithm gives the rough position of O in the current frame, x is the compressed multi-scale image feature of an image patch in X_{t+1}^w, l(x) denotes the position of the image patch corresponding to x, and l_{t+1}^* denotes the true position of the target in the current frame.
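The RV-SVM linear program of step 5) can be prototyped directly with `scipy.optimize.linprog`. This is an illustrative sketch rather than the patent's implementation: it uses a linear kernel, tiny synthetic data, and `train_rvsvm` is an assumed helper name:

```python
import numpy as np
from scipy.optimize import linprog

def train_rvsvm(X_pos, X_neg):
    """Solve min sum(alpha) + sum(xi) subject to
    sum_i alpha_i (K(x_i, x_u) - K(x_i, x_v)) >= 1 - xi_uv, alpha, xi >= 0,
    with u over positive (target) samples, v over negative samples, and the
    ranking vectors x_i taken as the positives (i = 1..N_1)."""
    X = np.vstack([X_pos, X_neg])
    K = X_pos @ X.T                       # K[i, j] = <x_i, x_j>, linear kernel
    n1, n = len(X_pos), len(X)
    n_pairs = n1 * (n - n1)
    c = np.ones(n1 + n_pairs)             # variables z = [alpha, xi]
    A = np.zeros((n_pairs, n1 + n_pairs))
    b = -np.ones(n_pairs)
    row = 0
    for u in range(n1):                   # positive sample index
        for v in range(n1, n):            # negative sample index
            A[row, :n1] = -(K[:, u] - K[:, v])   # flip sign for A z <= b form
            A[row, n1 + row] = -1.0              # slack xi_uv for this pair
            row += 1
    res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
    alpha = res.x[:n1]
    # ranking function F(x) = sum_i alpha_i <x_i, x>
    return lambda x: float(alpha @ (X_pos @ x))

rng = np.random.default_rng(0)
pos = rng.normal(1.0, 0.2, (5, 8))        # well-separated toy clusters
neg = rng.normal(-1.0, 0.2, (5, 8))
F = train_rvsvm(pos, neg)
print(F(pos[0]) > F(neg[0]))
```

Because the constraints enforce a margin of at least 1 − ξ between every positive/negative pair, on separable data the learned F scores every target patch above every background patch, which is exactly the property step 5)(4) relies on.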
The beneficial effects of the invention are:
1) A video tracking method based on rank learning is proposed that compresses multi-scale image features with a sparse random matrix, preserving almost all the information in the features while avoiding the curse of dimensionality.
2) The invention uses the Median-Flow tracking algorithm as a predictor to estimate the target position in the next frame and to build the training sample sets. This step is computationally cheap and effective, can handle sudden changes of target position in the video, and makes the tracker more robust.
3) The RV-SVM algorithm is adopted for online learning; its training turns a quadratic programming problem into a linear programming problem and simplifies the kernel computation, which greatly reduces the training time of online learning and improves the efficiency of the system.
4) By combining compressed multi-scale feature extraction, the Median-Flow tracking algorithm and the RV-SVM algorithm, the invention realizes a robust and efficient video tracking system that can effectively handle target scale change, partial occlusion, 3D rotation, pose change, fast target motion and similar problems in video tracking.
Brief description of the drawings
Fig. 1 is the overall flow chart of the video tracking method based on rank learning;
Fig. 2 is a schematic diagram of the Median-Flow component of Fig. 1.
Embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is further described below in conjunction with the drawings and embodiments.
As shown in Fig. 1 and Fig. 2, the video tracking method based on rank learning comprises the following steps:
1) representing the target O by a rectangular set of image pixels, the size of O being h × w, and generating multi-scale image features of O with multi-scale rectangle filters;
2) compressing the multi-scale image features with a sparse random measurement matrix, based on compressed sensing theory;
3) using the Median-Flow tracking algorithm as a predictor to predict the rough position of the target O in the next frame;
4) building the training sample sets X_t^1 and X_t^0 and the weakly labeled training sample set X_{t+1}^w, where X_t^1 is the set of target image patches extracted from the initial frame and several recent frames, X_t^0 is the set of background image patches extracted from recent frames, and X_{t+1}^w is the set of weakly labeled training samples extracted from the current frame;
5) using the RV-SVM algorithm as an online-learned classifier to separate the target O from the background, thereby achieving video tracking.
Step 1) is:
(1) convolve the target O with a series of multi-scale rectangle filters:

l_{i,j}(x̃, ỹ) = s(x̃, ỹ) * h_{i,j}(x̃, ỹ)

where s(x̃, ỹ) ∈ R^{w×h} represents the target O, (x̃, ỹ) are the pixel coordinates of O, R is a vector space whose dimension w × h equals the size of O, and h_{i,j} is the multi-scale rectangle filter (its defining formula appears as an image in the original document), with filter indices i = 1, …, w and j = 1, …, h;
(2) unroll each filtered image matrix l_{i,j} into a column vector l_k, k = 1, …, r, with r = w × h;
(3) concatenate the column vectors l_k into the high-dimensional multi-scale image feature h:

h = (l_1^T, …, l_r^T)^T

where h ∈ R^m, m = r², and T denotes transposition.
Step 2) is:
(1) generate a sparse random matrix φ ∈ R^{n×m} whose elements are

φ_{i,j} = √s × { +1 with probability p = 1/(2s); 0 with probability p = 1 − 1/s; −1 with probability p = 1/(2s) }

where s is chosen uniformly at random between 2 and 4, and p denotes probability;
(2) project the multi-scale image feature h with φ:

x = φh

obtaining the compressed multi-scale image feature x ∈ R^{n×1}.
Step 3), whose detailed flow is shown in the block diagram of the Median-Flow algorithm in Fig. 2, is:
(1) abstract the target O as a 10 × 10 grid of feature points, each feature point being represented by a 4 × 4 image patch containing it;
(2) track these feature points with the pyramidal optical-flow method, using 5 pyramid levels;
(3) compute the forward-backward (FB) error and the normalized cross-correlation (NCC) error of the feature points; the FB error is computed as

FB(T_f^k | S) = ED(T_f^k, T_b^k)

where S is a video or image sequence, ED denotes the Euclidean distance, t is the frame index of the video, T_f^k = (x_t, x_{t+1}, …, x_{t+k}) denotes feature point x_t tracked forward for k steps, and T_b^k = (x̂_t, x̂_{t+1}, …, x̂_{t+k}) denotes feature point x_{t+k} tracked backward for k steps;
(4) keep the 50% of feature points with the smaller tracking error and reject the other 50% as outliers;
(5) estimate the position of O from the median of the remaining feature points in each spatial dimension; for every pair of feature points, compute the ratio of their Euclidean distance in the current frame to their Euclidean distance in the previous frame, and take the mean of these ratios as the scale of the target.
Step 4) is:
Build the training sample set X_t^1 = {x′ : ‖l_s(x′) − l_s^*‖ ≤ α, s = 1, t − Δt, …, t};
Build the training sample set X_t^0 = {x′ : γ < ‖l_s(x′) − l_s^*‖ < β, s = t − Δt, …, t};
Build the weakly labeled training sample set X_{t+1}^w = {x′ : ‖l_s(x′) − l_s^w‖ < α, s = t + 1};
where t is the frame index of the video, Δt is the number of recent frames, l_s(x′) denotes the position of image patch x′ in frame s, l_s^* denotes the true position of the target O, l_s^w is the rough position of O obtained by the Median-Flow tracking algorithm, and α, β and γ are sampling radii.
Step 5) is:
(1) extract the compressed multi-scale image feature x of every image patch in the training sample sets X_t^1 and X_t^0;
(2) require that features in the training sample set X_t^1 rank higher than features in X_t^0 and X_{t+1}^w, i.e. x_i^1 ≻ x_j^0 and x_i^1 ≻ x_j^w, where i and j are feature indices, i = 1, …, N_1 and j = N_1 + 1, …, N_1 + N_0; N_1 is the number of samples in the training sample set X_t^1 and N_0 is the total number of samples in X_t^0 and X_{t+1}^w;
(3) under the ordering set in step (2), use the compressed multi-scale image features x of the training sample sets to train the ranking function F_{t+1}(x), first solving the linear programming problem

min L(α, ξ) = Σ_i α_i + Σ_{ij} ξ_{ij}
s.t. Σ_i α_i (K(x_i, x_u) − K(x_i, x_v)) ≥ 1 − ξ_{uv}, α ≥ 0, ξ ≥ 0

where u = 1, …, N_1, v = N_1 + 1, …, N_1 + N_0, ξ are slack variables and K(·, ·) is a kernel function; for a linear kernel, K(x, z) = ⟨x, z⟩;
after the optimal solution α* of the linear programming problem is obtained, the ranking function F_{t+1}(x) is expressed as

F_{t+1}(x) = Σ_i α_i* K(x_i, x)

where K(·, ·) is the kernel function and i = 1, …, N_1;
(4) separate the target O from the background: the image patch that maximizes the ranking function F_{t+1}(x) is taken as the true position of O,

l_{t+1}^* = l(argmax F_{t+1}(x)), x ∈ X_{t+1}^w

where X_{t+1}^w is the weakly labeled training sample set extracted after the Median-Flow tracking algorithm gives the rough position of O in the current frame, x is the compressed multi-scale image feature of an image patch in X_{t+1}^w, l(x) denotes the position of the image patch corresponding to x, and l_{t+1}^* denotes the true position of the target in the current frame.
Embodiment 1
A video tracking method based on rank learning comprises the following steps:
1) Read the initial image frame and initialize the target position parameters (x, y, w, h), where (x, y) is the coordinate of the target's top-left pixel and w and h are the width and height of the target.
2) Extract the target image patch sample set X_t^1 and the background sample set X_t^0:

X_t^1 = {x′ : ‖l_s(x′) − l_s^*‖ ≤ α, s = 1, t − Δt, …, t}
X_t^0 = {x′ : γ < ‖l_s(x′) − l_s^*‖ < β, s = t − Δt, …, t}

where t is the frame index of the video, Δt = 2 is the number of recent frames, l_s(x′) denotes the position of image patch x′ in frame s, l_s^* denotes the true position of the target, and the sampling radii are α = 4, γ = 8 and β = 30.
3) Extract the compressed multi-scale image feature x of each image patch in the sample sets:
3.1) convolve the target O with a series of multi-scale rectangle filters:

l_{i,j}(x̃, ỹ) = s(x̃, ỹ) * h_{i,j}(x̃, ỹ)

where s(x̃, ỹ) ∈ R^{w×h} represents the target O, (x̃, ỹ) are the pixel coordinates of O, R is a vector space whose dimension w × h equals the size of O, and h_{i,j} is the multi-scale rectangle filter (its defining formula appears as an image in the original document), with filter indices i = 1, …, w and j = 1, …, h;
3.2) unroll each filtered image matrix l_{i,j} into a column vector l_k, k = 1, …, r, with r = w × h;
3.3) concatenate the column vectors l_k into the high-dimensional multi-scale image feature h:

h = (l_1^T, …, l_r^T)^T

where h ∈ R^m, m = r², and T denotes transposition;
3.4) generate a sparse random matrix φ ∈ R^{n×m} whose elements are

φ_{i,j} = √s × { +1 with probability p = 1/(2s); 0 with probability p = 1 − 1/s; −1 with probability p = 1/(2s) }

where s is chosen uniformly at random between 2 and 4, and p denotes probability;
3.5) project the multi-scale image feature h with φ:

x = φh

obtaining the compressed multi-scale image feature x ∈ R^{n×1}.
4) Read the next image frame.
5) Estimate the position of the target in the next frame with the Median-Flow algorithm:
5.1) abstract the target as a 10 × 10 grid of feature points, each feature point being represented by a 4 × 4 image patch containing it;
5.2) track these feature points with the pyramidal optical-flow method, using 5 pyramid levels;
5.3) compute the FB error and the NCC error of the feature points; the FB error is computed as

FB(T_f^k | S) = ED(T_f^k, T_b^k)

where S is a video or image sequence, ED denotes the Euclidean distance, t is the frame index of the video, T_f^k = (x_t, x_{t+1}, …, x_{t+k}) denotes feature point x_t tracked forward for k steps, T_b^k = (x̂_t, x̂_{t+1}, …, x̂_{t+k}) denotes feature point x_{t+k} tracked backward for k steps, and k = 5;
5.4) keep the 50% of feature points with the smaller tracking error and reject the other 50% as outliers;
5.5) estimate the target position from the median of the remaining feature points in each spatial dimension; in addition, for every pair of feature points, compute the ratio of their Euclidean distance in the current frame to their Euclidean distance in the previous frame, and take the mean of these ratios as the scale of the target.
6) From the rough target position estimated by the Median-Flow algorithm, extract the weakly labeled image patch sample set

X_{t+1}^w = {x′ : ‖l_s(x′) − l_s^w‖ < α, s = t + 1}

where t is the frame index of the video, l_s(x′) denotes the position of image patch x′ in frame s, l_s^w is the rough target position obtained by the Median-Flow tracking algorithm, and α = 4 is the sampling radius.
7) Extract the compressed multi-scale image feature x of each image patch in the weakly labeled sample set; the computation is the same as in step 3).
8) Rank the training sample features extracted in steps 3) and 7); the ordering rule is that features in the training sample set X_t^1 rank higher than features in X_t^0 and X_{t+1}^w, i.e. x_i^1 ≻ x_j^0 and x_i^1 ≻ x_j^w, where i and j are feature indices, i = 1, …, N_1 and j = N_1 + 1, …, N_1 + N_0; N_1 is the number of samples in the training sample set X_t^1 and N_0 is the total number of samples in X_t^0 and X_{t+1}^w.
9) Learn the ranking function F_{t+1}(x), separate the target from the background and obtain the true target position, as follows:
9.1) train the ranking function F_{t+1}(x) with the training sample feature vectors under the ordering rule of step 8), first solving the linear programming problem

min L(α, ξ) = Σ_i α_i + Σ_{ij} ξ_{ij}
s.t. Σ_i α_i (K(x_i, x_u) − K(x_i, x_v)) ≥ 1 − ξ_{uv}, α ≥ 0, ξ ≥ 0

where u = 1, …, N_1, v = N_1 + 1, …, N_1 + N_0, ξ are slack variables and K(·, ·) is a kernel function; for a linear kernel, K(x, z) = ⟨x, z⟩;
9.2) after the optimal solution α* of the linear programming problem is obtained, the ranking function F_{t+1}(x) is expressed as

F_{t+1}(x) = Σ_i α_i* K(x_i, x)

where K(·, ·) is the kernel function and i = 1, …, N_1;
9.3) the image patch that maximizes the ranking function F_{t+1}(x) is taken as the true target position,

l_{t+1}^* = l(argmax F_{t+1}(x)), x ∈ X_{t+1}^w

where X_{t+1}^w is the weakly labeled training sample set extracted after the Median-Flow tracking algorithm gives the rough target position in the current frame, x is the compressed multi-scale image feature of an image patch in X_{t+1}^w, l(x) denotes the position of the image patch corresponding to x, and l_{t+1}^* denotes the true position of the target in the current frame.
10) Extract the target image patch sample set and the background sample set at the true target position, and extract the compressed multi-scale feature of each image patch in them; this follows steps 2) and 3).
11) Check whether this is the last frame of the video; if so, the algorithm ends; otherwise, go to step 4).
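To show how steps 1)–11) fit together, here is a toy end-to-end loop. Every helper below is a simplified stand-in for the components described above, not the patent's implementation: `median_flow` merely nudges the box, and `score` sums pixel intensities instead of evaluating the learned ranking function.

```python
import numpy as np

# --- Dummy stand-ins for the components of steps 1-9 ---------------------
def median_flow(prev, curr, box):             # step 5: rough position
    x, y, w, h = box
    return (x + 1, y, w, h)                   # pretend the target drifts right

def candidate_boxes(box, radius):             # step 6: weak sample set
    x, y, w, h = box
    return [(x + dx, y + dy, w, h)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)]

def score(frame, box):                        # steps 7-9: ranking stand-in
    x, y, w, h = box
    return float(frame[y:y + h, x:x + w].sum())

def track(frames, init_box, alpha=4):
    """Skeleton of Embodiment 1: predict a rough position (step 5), extract
    weak samples around it (step 6), score them (steps 7-9), and keep the
    best-scoring box as the new target position (steps 10-11)."""
    box = init_box
    for t in range(1, len(frames)):           # loop of steps 4) and 11)
        rough = median_flow(frames[t - 1], frames[t], box)
        box = max(candidate_boxes(rough, alpha),
                  key=lambda b: score(frames[t], b))
    return box

# A bright 5x5 square drifting one pixel right per frame
frames = [np.zeros((40, 40)) for _ in range(3)]
for t, f in enumerate(frames):
    f[5:10, 10 + t:15 + t] = 1.0
print(track(frames, (10, 5, 5, 5)))
```

Even with these crude stand-ins, the loop structure (weak prediction, local candidate search, best-scoring box) matches the flow of Fig. 1.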
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (6)

1. A video tracking method based on rank learning, characterized by comprising the following steps:
1) representing the target O by a rectangular set of image pixels, the size of O being h × w, and generating multi-scale image features of O with multi-scale rectangle filters;
2) compressing the multi-scale image features with a sparse random measurement matrix, based on compressed sensing theory;
3) using the Median-Flow tracking algorithm as a predictor to predict the rough position of the target O in the next frame;
4) building the training sample sets X_t^1 and X_t^0 and the weakly labeled training sample set X_{t+1}^w, where X_t^1 is the set of target image patches extracted from the initial frame and several recent frames, X_t^0 is the set of background image patches extracted from recent frames, and X_{t+1}^w is the set of weakly labeled training samples extracted from the current frame;
5) using the RV-SVM algorithm as an online-learned classifier to separate the target O from the background, thereby achieving video tracking.
2. The video tracking method based on rank learning according to claim 1, characterized in that step 1) is:
(1) convolving the target O with a series of multi-scale rectangle filters:

l_{i,j}(x̃, ỹ) = s(x̃, ỹ) * h_{i,j}(x̃, ỹ)

where s(x̃, ỹ) ∈ R^{w×h} represents the target O, (x̃, ỹ) are the pixel coordinates of O, R is a vector space whose dimension w × h equals the size of O, and h_{i,j} is the multi-scale rectangle filter (its defining formula appears as an image in the original document), with filter indices i = 1, …, w and j = 1, …, h;
(2) unrolling each filtered image matrix l_{i,j} into a column vector l_k, k = 1, …, r, with r = w × h;
(3) concatenating the column vectors l_k into the high-dimensional multi-scale image feature h:

h = (l_1^T, …, l_r^T)^T

where h ∈ R^m, m = r², and T denotes transposition.
3. The video tracking method based on rank learning according to claim 1, characterized in that step 2) is:
(1) generating a sparse random matrix φ ∈ R^{n×m} whose elements are

φ_{i,j} = √s × { +1 with probability p = 1/(2s); 0 with probability p = 1 − 1/s; −1 with probability p = 1/(2s) }

where s is chosen uniformly at random between 2 and 4, and p denotes probability;
(2) projecting the multi-scale image feature h with φ:

x = φh

obtaining the compressed multi-scale image feature x ∈ R^{n×1}.
4. the video tracing method based on sequence study according to claim 1, is characterized in that, described step 3) is:
(1) by abstract target O be 10 × 10 grid unique point, to each unique point, comprise with one the image block that this unique point and size are 4 × 4 and represent;
(2) follow the tracks of these unique points by pyramid optical flow method, the pyramidal number of plies used is 5 layers;
(3) compute the FB (forward-backward) error and NCC error of these feature points, where the FB error is computed as follows:
FB(T_f^k | S) = ED(T_f^k, T_b^k)
In the formula, S is a video or image sequence, ED denotes Euclidean distance, and t is the frame index of the video; T_f^k = (x_t, x_{t+1}, …, x_{t+k}) denotes tracking feature point x_t forward by k steps, and T_b^k = (x̂_t, x̂_{t+1}, …, x̂_{t+k}) denotes tracking feature point x_{t+k} backward by k steps;
(4) keep the 50% of feature points with the smaller tracking error; reject the remaining 50% as outliers;
(5) estimate the position of target O as the median of the remaining feature points' displacements in each spatial dimension; for every pair of feature points, compute the ratio of their Euclidean distance in the current frame to their Euclidean distance in the previous frame, then take the mean of these ratios; this mean is the scale of the target.
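Steps (4)–(5) of claim 4 (outlier rejection by tracking error, median displacement, and pairwise-distance scale) can be sketched as follows; the array layout and function name are assumptions:

```python
import numpy as np

def medianflow_update(pts_prev, pts_curr, fb_err):
    """Sketch of claim-4 steps (4)-(5): keep the 50% of feature points with
    the smaller forward-backward error, estimate the target shift as the
    per-dimension median displacement, and the scale as the mean ratio of
    pairwise point distances between the current and previous frames."""
    keep = fb_err <= np.median(fb_err)      # reject the worse half as outliers
    p0, p1 = pts_prev[keep], pts_curr[keep]
    shift = np.median(p1 - p0, axis=0)      # median of each spatial dimension
    ratios = []
    for a in range(len(p0)):
        for b in range(a + 1, len(p0)):
            d_prev = np.linalg.norm(p0[a] - p0[b])
            if d_prev > 0:
                ratios.append(np.linalg.norm(p1[a] - p1[b]) / d_prev)
    return shift, float(np.mean(ratios))    # (position offset, target scale)

p0 = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
shift, scale = medianflow_update(p0, p0 + [2., 3.], np.ones(4))
print(shift, scale)  # [2. 3.] 1.0 for a pure translation
```

The median over points is what makes the estimate robust: up to half of the surviving points can still drift without moving the result.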
5. The video tracking method based on rank learning according to claim 1, characterized in that said step 4) is:
build the training sample set X_t^1, X_t^1 = {x′ : ||l_s(x′) − l_s^*|| ≤ α, s = 1, t − Δt, …, t};
build the training sample set X_t^0, X_t^0 = {x′ : γ < ||l_s(x′) − l_s^*|| < β, s = t − Δt, …, t};
build the weakly labeled training sample set X_{t+1}^w, X_{t+1}^w = {x′ : ||l_s(x′) − l_s^w|| < α, s = t + 1};
where t is the frame index of the video, Δt is the number of recent frames, l_s(x′) denotes the position of image block x′ in frame s, l_s^* denotes the true position of target O, l_s^w is the rough position of target O obtained by the Median-Flow tracking algorithm, and α, β and γ are sample radii.
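For a single frame, the three sample sets of claim 5 reduce to distance thresholding around the true and rough target positions. A sketch, under the assumption that candidate image blocks are given by their (x, y) positions:

```python
import numpy as np

def build_sample_sets(block_pos, true_pos, rough_pos, alpha, beta, gamma):
    """Sketch of claim 5 for one frame s: positives X^1 lie within radius
    alpha of the true position l*, negatives X^0 lie in the annulus
    gamma < d < beta around l*, and weak samples X^w lie within alpha of
    the rough Median-Flow position l^w. block_pos is an (n, 2) array."""
    d_true = np.linalg.norm(block_pos - true_pos, axis=1)
    d_weak = np.linalg.norm(block_pos - rough_pos, axis=1)
    x1 = block_pos[d_true <= alpha]                      # X_t^1
    x0 = block_pos[(d_true > gamma) & (d_true < beta)]   # X_t^0
    xw = block_pos[d_weak < alpha]                       # X_{t+1}^w
    return x1, x0, xw

blocks = np.array([[0., 0.], [5., 0.], [20., 0.]])
x1, x0, xw = build_sample_sets(blocks, np.array([0., 0.]),
                               np.array([5., 0.]),
                               alpha=4.0, beta=30.0, gamma=8.0)
print(len(x1), len(x0), len(xw))  # 1 1 1
```

The gap γ ≤ d ≤ α … β between positives and negatives deliberately leaves ambiguous blocks out of both sets, which reduces label noise during online updates.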
6. The video tracking method based on rank learning according to claim 1, characterized in that said step 5) is:
(1) extract the multi-scale compressed image feature x of each image block in the labeled training sample sets X_t^1 and X_t^0;
(2) set the feature ranking of training sample set X_t^1 higher than the feature rankings of X_t^0 and X_{t+1}^w, i.e. x_i ≻ x_j, where i and j are feature indices, i = 1, …, N_1, j = N_1 + 1, …, N_1 + N_0; N_1 is the number of samples in training sample set X_t^1, and N_0 is the total number of samples in training sample sets X_t^0 and X_{t+1}^w;
(3) according to the condition set in step (2), use the multi-scale compressed image features x of the training sample sets X_t^1 and X_t^0 to train the ranking function F_{t+1}(x); first, solve the linear programming problem:
min L(α, ξ) = Σ_i α_i + C Σ_{ij} ξ_{ij}
s.t. Σ_i α_i (K(x_i, x_u) − K(x_i, x_v)) ≥ 1 − ξ_{uv}, α ≥ 0, ξ ≥ 0
In the formula, u = 1, …, N_1, v = N_1 + 1, …, N_1 + N_0, C is a penalty parameter, ξ is a slack variable, and K(·,·) is the kernel function; for a linear kernel, K(x, z) = ⟨x, z⟩;
After obtaining the optimal solution α^* of the linear programming problem, the ranking function F_{t+1}(x) can be expressed as:
F_{t+1}(x) = Σ_i α_i^* K(x_i, x)
In the formula, K(·,·) is the kernel function, i = 1, …, N_1;
(4) separate target O from the background; the image block that maximizes the ranking function F_{t+1}(x) is the true position of target O:
l_{t+1}^* = l(arg max F_{t+1}(x)), x ∈ X_{t+1}^w
In the formula, the rough position of target O in the current frame is first obtained with the Median-Flow tracking algorithm, and the weakly labeled training sample set X_{t+1}^w is then extracted; x is the multi-scale compressed image feature of an image block in X_{t+1}^w, l(x) denotes the position of the image block corresponding to feature x, and l_{t+1}^* denotes the true position of the target in the current frame.
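The RV-SVM training in claim 6 is a linear program over α and the slack variables. A small sketch with a linear kernel using `scipy.optimize.linprog`; the penalty C, the toy data, and all names are illustrative assumptions, not the patent's implementation:

```python
import numpy as np
from scipy.optimize import linprog

def train_rank_lp(X_pos, X_neg, C=1.0):
    """Sketch of the claim-6 RV-SVM training as a linear program with a
    linear kernel K(x, z) = <x, z>: minimize sum(alpha) + C*sum(xi)
    subject to sum_i alpha_i (K(x_i, x_u) - K(x_i, x_v)) >= 1 - xi_uv
    for every positive u / negative v pair, with alpha, xi >= 0."""
    X = np.vstack([X_pos, X_neg])
    K = X_pos @ X.T                       # K(x_i, .) for i = 1..N1
    n1, n0 = len(X_pos), len(X_neg)
    n_pairs = n1 * n0
    c = np.concatenate([np.ones(n1), C * np.ones(n_pairs)])
    A = np.zeros((n_pairs, n1 + n_pairs))
    for u in range(n1):
        for v in range(n0):
            row = u * n0 + v
            A[row, :n1] = -(K[:, u] - K[:, n1 + v])  # -sum_i a_i (K_iu - K_iv)
            A[row, n1 + row] = -1.0                  # -xi_uv
    res = linprog(c, A_ub=A, b_ub=-np.ones(n_pairs), bounds=(0, None))
    alpha = res.x[:n1]
    # ranking function F(x) = sum_i alpha_i* K(x_i, x)
    return lambda x: float(alpha @ (X_pos @ x))

F = train_rank_lp(np.array([[1.0, 0.0], [0.9, 0.1]]),
                  np.array([[0.0, 1.0], [0.1, 0.9]]))
print(F(np.array([1.0, 0.0])) > F(np.array([0.0, 1.0])))  # True
```

The point of the claim is that this is a linear program rather than the quadratic program of a standard ranking SVM, which is what keeps the online retraining cheap.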
CN201410054630.7A 2014-02-18 2014-02-18 Video tracking method based on rank learning Pending CN103886585A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410054630.7A CN103886585A (en) 2014-02-18 2014-02-18 Video tracking method based on rank learning


Publications (1)

Publication Number Publication Date
CN103886585A true CN103886585A (en) 2014-06-25

Family

ID=50955458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410054630.7A Pending CN103886585A (en) 2014-02-18 2014-02-18 Video tracking method based on rank learning

Country Status (1)

Country Link
CN (1) CN103886585A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609452A (en) * 2009-07-10 2009-12-23 南方医科大学 Fuzzy SVM feedback evaluation method for medical image target recognition
CN101661559A (en) * 2009-09-16 2010-03-03 中国科学院计算技术研究所 Digital image training and detecting methods
CN102147866A (en) * 2011-04-20 2011-08-10 上海交通大学 Target identification method based on training Adaboost and support vector machine


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
HWANJO YU et al.: "An efficient method for learning nonlinear ranking SVM functions", Information Sciences *
KAIHUA ZHANG et al.: "Real-Time Compressive Tracking", Computer Vision *
YANCHENG BAI et al.: "Robust Tracking via Weakly Supervised Ranking SVM", Computer Vision and Pattern Recognition *
ZDENEK KALAL et al.: "Forward-Backward Error: Automatic Detection of Tracking Failures", Pattern Recognition *
ZHAO Lu, YU Huimin: "Vehicle detection based on prior shape information and level set method", Journal of Zhejiang University (Engineering Science) *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105989613A (en) * 2015-02-05 2016-10-05 南京市客运交通管理处 Passenger flow tracking algorithm suitable for bus scenes
CN104680143A (en) * 2015-02-28 2015-06-03 武汉烽火众智数字技术有限责任公司 Quick image search method for video investigation
CN104680143B (en) * 2015-02-28 2018-02-27 武汉烽火众智数字技术有限责任公司 Fast image retrieval method for video investigation
CN105976390A (en) * 2016-05-25 2016-09-28 南京信息职业技术学院 Steel tube counting method combining support vector machine threshold statistics and spot detection
CN105976390B (en) * 2016-05-25 2018-09-18 南京信息职业技术学院 Steel tube counting method combining support vector machine threshold statistics and spot detection
CN110461270A (en) * 2017-02-14 2019-11-15 阿特雷塞斯有限责任公司 High speed optical tracking with compression and/or CMOS windowing
CN108665479A (en) * 2017-06-08 2018-10-16 西安电子科技大学 Infrared target tracking method based on compressed-domain multi-scale feature TLD
CN110785775A (en) * 2017-07-07 2020-02-11 三星电子株式会社 System and method for optical tracking
CN110785775B (en) * 2017-07-07 2023-12-01 三星电子株式会社 System and method for optical tracking
CN107393523A (en) * 2017-07-28 2017-11-24 深圳市盛路物联通讯技术有限公司 Noise monitoring method and system
CN107393523B (en) * 2017-07-28 2020-11-13 深圳市盛路物联通讯技术有限公司 Noise monitoring method and system
CN109933715A (en) * 2019-03-18 2019-06-25 杭州电子科技大学 Online learning ranking method based on the listwise algorithm
CN109933715B (en) * 2019-03-18 2021-05-28 杭州电子科技大学 Online learning ranking method based on the listwise algorithm
CN110264492A (en) * 2019-06-03 2019-09-20 浙江大学 Efficient satellite image self-correcting multi-target tracking method
CN110264492B (en) * 2019-06-03 2021-03-23 浙江大学 Efficient satellite image self-correcting multi-target tracking method

Similar Documents

Publication Publication Date Title
CN103886585A (en) Video tracking method based on rank learning
Zhu et al. Fusing spatiotemporal features and joints for 3d action recognition
Jiang et al. Recognizing human actions by learning and matching shape-motion prototype trees
CN102034096B (en) Video event recognition method based on top-down motion attention mechanism
Malik et al. The three R’s of computer vision: Recognition, reconstruction and reorganization
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN103336957A (en) Network coderivative video detection method based on spatial-temporal characteristics
CN103279768A (en) Method for identifying faces in videos based on incremental learning of face partitioning visual representations
CN103886325A (en) Cyclic matrix video tracking method with partition
Lin et al. Hand-raising gesture detection in real classroom
Lin et al. Deep learning of spatio-temporal features with geometric-based moving point detection for motion segmentation
CN105574545B Multi-view semantic segmentation method and device for street environment images
Ibrahem et al. Real-time weakly supervised object detection using center-of-features localization
CN103413154A (en) Human motion identification method based on normalized class Google measurement matrix
CN104778699A (en) Adaptive object feature tracking method
Basavaiah et al. Human activity detection and action recognition in videos using convolutional neural networks
Sriram et al. Analytical review and study on object detection techniques in the image
Murtaza et al. Multi-view human action recognition using histograms of oriented gradients (HOG) description of motion history images (MHIs)
Arunnehru et al. Automatic activity recognition for video surveillance
Zheng et al. Bi-heterogeneous Convolutional Neural Network for UAV-based dynamic scene classification
Yang et al. MediaCCNY at TRECVID 2012: Surveillance Event Detection.
Liu et al. Mean shift fusion color histogram algorithm for nonrigid complex target tracking in sports video
Ji et al. A view-invariant action recognition based on multi-view space hidden markov models
Palmer et al. Scale proportionate histograms of oriented gradients for object detection in co-registered visual and range data
Wachs et al. Recognizing Human Postures and Poses in Monocular Still Images.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140625

RJ01 Rejection of invention patent application after publication