CN101493889B - Method and apparatus for tracking video object - Google Patents


Info

Publication number
CN101493889B
Authority
CN
China
Prior art keywords
contour
video object
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2008100005828A
Other languages
Chinese (zh)
Other versions
CN101493889A (en)
Inventor
赵光耀
于纪征
孔晓东
曾贵华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Huawei Technologies Co Ltd
Priority to CN2008100005828A
Publication of CN101493889A
Application granted
Publication of CN101493889B
Legal status: Active
Anticipated expiration


Landscapes

  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a method and an apparatus for tracking a video object, relating to the technical field of image processing and designed to achieve accurate tracking of the video object. The method comprises the steps of: extracting feature points of the contour of the video object in the current frame image; finding matched feature points in the next frame image; detecting at least one candidate contour of the video object in the next frame image; calculating the contour feature value of the video object in the current frame image; calculating the contour feature value of each candidate contour; and comparing the contour feature value of the candidate contour with that of the video object in the current frame image, and taking the candidate contour as the contour of the video object in the next frame image if the two contour feature values match. The method and the apparatus improve the accuracy of tracking the video object.

Description

Method and device for tracking video object
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for tracking a video object.
Background
Computer vision uses cameras, computers and other equipment in place of human eyes to identify, track and measure a target. Real-time tracking of video objects is an important subject in the field of computer vision, and is the basis of work such as video analysis, video understanding, video object recognition and video object behavior analysis.
Currently, there are many methods for video object tracking. The method of tracking video objects may be classified into a method of tracking video objects based on detection and a method of tracking video objects based on recognition according to whether pattern matching is required between frames of an image.
A detection-based method directly extracts the contour of the video object in each frame of the image according to some feature of the video object, without passing object motion state parameters or matching contours between image frames. Detection-based methods include frame-difference detection and the like. A recognition-based method usually first extracts some feature of the video object and then searches each frame of the image for the region that best matches this feature; the best-matching region is the video object.
Of these two approaches, the detection-based method has a simple algorithm and is easy to implement, but its tracking effect is not ideal. The main research direction of current video object tracking technology has therefore moved to recognition-based methods.
Among recognition-based methods, the Condensation (conditional density propagation) tracking algorithm, that is, the conditional probability density propagation algorithm, is one of the most widely used contour tracking methods.
The Condensation tracking algorithm is a particle-filter-based tracking algorithm. Particle filtering, also known as Sequential Monte Carlo (SMC), implements Bayesian recursive filtering by the Monte Carlo method. It uses a set of weighted samples to represent the posterior probability density $p(x_k \mid z_{1:k})$ of the system state vector; when the number of samples is large enough, this estimate is equivalent to the posterior probability density function.
In the Condensation tracking algorithm, the video object is represented by an active contour model together with a shape space: the contour curve of the video object is represented by B-Snake control points, and possible changes of the contour curve, such as translation and rotation, are represented in the shape space.
The motion state parameter T of the video object contour can be expressed as $T = (TX, TY, \theta, SX, SY)$, where TX and TY are the coordinates of the center point of the video object in the x and y directions respectively, $\theta$ is the angle by which the contour of the video object is rotated, and SX and SY are the scales of the video object in the x and y directions respectively. The shape space parameter S of the video object in the shape space is represented as $S = (TX, TY, SX\cos\theta - 1, SY\cos\theta - 1, -SY\sin\theta, SX\sin\theta)$.
The process of video object tracking using the Condensation tracking algorithm is as follows.
1) Obtain the initial motion state $T_0$ of the video object from the initial frame image and initialize $N_s$ particles, each with initial weight $w_0^i = 1/N_s$; the motion state and shape space parameters of each particle are $T^i$ and $S^i$ respectively $(i = 1, 2, \ldots, N_s)$. In the k-th frame, a state transition is performed on the state of each particle. The state transition equations are shown in equation (11):

$TX_k^i = TX_{k-1}^i + B_1 \times \xi_{1,k}^i$

$TY_k^i = TY_{k-1}^i + B_2 \times \xi_{2,k}^i$

$\theta_k^i = \theta_{k-1}^i + B_3 \times \xi_{3,k}^i$    (11)

$SX_k^i = SX_{k-1}^i + B_4 \times \xi_{4,k}^i$

$SY_k^i = SY_{k-1}^i + B_5 \times \xi_{5,k}^i$

where $B_1, B_2, B_3, B_4, B_5$ are constants and $\xi$ is a random number in $[-1, 1]$.
2) Each candidate particle is evaluated by using the observed value (motion state parameter T, shape space parameter S, etc.) of the current frame image, and the weight value of each particle is calculated.
The specific process is as follows:
21) For particle $N_i$, calculate the motion state parameter $T^i$ and the shape space parameter $S^i$ according to the method of formula (1).

22) From the motion state parameter $T^i$ and the shape space parameter $S^i$, obtain the B-Snake control points of particle $N_i$ and fit the contour curve of the video object from these control points.

23) Sample N points on the contour curve of the video object, and for each sampling point find the pixel with the maximum gradient along its normal direction.

24) Find the distance $DIS_i(n)$ $(n = 1, 2, \ldots, N)$ between each sampling point on the contour curve and the maximum-gradient pixel on its normal; use these distances as the measurement to obtain the observation probability density of particle $N_i$ and to update its weight $w_k^i$.

The motion state parameters T of the particles, weighted by their weights $w_k^i$ and summed, give the expected motion state parameter, from which the expected shape space parameter $S_k$, the B-Snake control points and the contour curve of the video object are obtained. This completes the tracking of the video object contour for the current frame.
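For illustration, a minimal sketch of one such Condensation iteration is given below (Python; the resampling step and the Gaussian form of the weighting are assumptions made for illustration, not the exact formulas of the algorithm):

```python
import numpy as np

def condensation_step(particles, weights, B, measure_distances, sigma=5.0):
    """One Condensation iteration over particle states (TX, TY, theta, SX, SY).

    particles: (Ns, 5) array of motion-state parameters T^i
    weights:   (Ns,) array of particle weights
    B:         (5,) array of the constants B1..B5 in equation (11)
    measure_distances: callable mapping a state T^i to the distances DIS_i(n)
                       between contour sample points and the maximum-gradient
                       pixels found along their normals
    """
    Ns = len(particles)
    # Resample particles in proportion to their previous weights.
    idx = np.random.choice(Ns, size=Ns, p=weights / weights.sum())
    particles = particles[idx]

    # State transition of equation (11): add bounded random perturbations.
    xi = np.random.uniform(-1.0, 1.0, size=particles.shape)
    particles = particles + B * xi

    # Evaluate each candidate particle against the current frame observation.
    new_weights = np.empty(Ns)
    for i, state in enumerate(particles):
        dis = measure_distances(state)  # DIS_i(n), n = 1..N
        # Gaussian weighting from the distances (an assumed form; the patent's
        # exact weight formula is only given as an image and is not reproduced).
        new_weights[i] = np.exp(-np.sum(dis ** 2) / (2 * sigma ** 2))
    new_weights /= new_weights.sum()

    # Expected motion state = weighted sum of particle states.
    expected_state = (new_weights[:, None] * particles).sum(axis=0)
    return particles, new_weights, expected_state
```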
In the process of implementing the invention, the inventor finds that the prior art has the following problems:
the Condensation tracking algorithm can realize real-time tracking of the video object contour with affine change (such as rotation, translation, scaling and the like). For example, when the video object is a rigid body, the rigid body is not separated from its components during the motion process, so that the rigid body can be accurately tracked by the convergence tracking algorithm. However, for video objects with non-affine changes, such as the situation that the arm is bent during the walking process of the human body, the Condensation tracking algorithm cannot accurately track the video objects. In addition, the computational complexity of the convergence tracking algorithm is complex, so the convergence tracking algorithm tracks the video object, and the tracking speed is low.
Disclosure of Invention
In order to solve the problems of low tracking speed and poor accuracy of a video object in the prior art, the embodiment of the invention provides a method and a device for tracking the video object.
In one aspect, an embodiment of the present invention provides a method for tracking a video object, where the method includes the following steps:
extracting feature points of the video object contour in the current frame image;
finding matched feature points matched with the feature points in the next frame of image;
detecting at least one candidate contour of the video object in a next frame of image according to the matched feature points;
calculating a contour characteristic value of a video object in the current frame image;
calculating contour characteristic values of the candidate contours;
and comparing the contour feature value of the candidate contour with the contour feature value of the video object in the current frame image, and if the two contour feature values match, determining that the candidate contour is the contour of the video object in the next frame image.
According to the method provided by the embodiment of the invention, the matching feature point of the video object of the current frame is first determined in the next frame image, and the candidate contours of the video object in the next frame image are then detected according to the matching feature point. The contour feature values of the video object in the two successive frame images are then matched; if they match, the candidate contour is the contour of the video object in the next frame image. Even when the video object undergoes non-affine changes, the contour feature value of the video object in the next frame image can still be extracted and matched against the contour feature value of the video object in the current frame image to find the best match, so that the contour of the video object in the next frame image can be accurately described. The method of the embodiment of the invention thus overcomes the defect of the prior art that video objects with non-affine changes cannot be accurately tracked. In addition, the method reduces the amount of computation in tracking the video object and improves the tracking speed.
Therefore, the method for tracking the video object in the embodiment of the invention not only can accurately track the video object with affine change, but also can track the video object with non-affine change, thereby improving the accuracy of tracking the video object.
In another aspect, an embodiment of the present invention provides an apparatus for tracking a video object, the apparatus including:
the first positioning unit is used for acquiring the characteristic points of the video object outline in the current frame image;
the second positioning unit is used for finding matched feature points matched with the feature points in the next frame of image;
the contour detection unit is used for detecting at least one candidate contour of the video object in the next frame of image according to the matched feature points, and comprises: the region prediction module is used for obtaining the appearance region of the video object contour in the next frame image through linear transformation by taking the matched feature point as a center; a contour selection module for detecting at least one candidate contour of the video object within the occurrence region;
the first calculating unit is used for calculating the contour characteristic value of the video object in the current frame image;
a second calculation unit, configured to calculate a contour feature value of the candidate contour;
and the contour matching unit is used for comparing the contour feature value of the candidate contour with the contour feature value of the video object in the current frame image; if the two contour feature values match, the candidate contour is the contour of the video object in the next frame image.
With the apparatus provided by the embodiment of the invention, the contour detection unit determines the candidate contours of the video object in the next frame image, the first and second calculating units respectively calculate the contour feature values of the video object in the two successive frame images, and the contour matching unit matches the two contour feature values. Even when the video object undergoes non-affine changes, the contour feature value of the video object in the next frame image can still be extracted and matched against the contour feature value of the video object in the current frame image to find the best match, so that the contour of the video object in the next frame image can be accurately described. The apparatus of the embodiment of the invention thus overcomes the defect of the prior art that video objects with non-affine changes cannot be accurately tracked. In addition, the apparatus reduces the amount of computation in tracking the video object and improves the tracking speed.
Therefore, the device for tracking the video object in the embodiment of the invention not only can accurately track the video object with affine change, but also can track the video object with non-affine change, thereby improving the accuracy of tracking the video object.
Drawings
FIG. 1 is a flow diagram of a method of tracking a video object according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a method of tracking a video object according to an embodiment of the invention;
FIG. 3 is a diagram of a first embodiment of a method for tracking video objects according to an embodiment of the present invention;
FIG. 4 is a diagram showing the result of a Haar wavelet transform in a method of tracking a video object according to an embodiment of the present invention;
FIG. 5 is a diagram of wavelet contour descriptors at different resolutions in a method for tracking a video object according to an embodiment of the present invention;
FIG. 6 is a graph of experimental results of a method of tracking a video object using an embodiment of the present invention;
FIG. 7 is a graph of yet another experimental result of a method of tracking a video object using an embodiment of the present invention;
FIG. 8 is a schematic diagram of an apparatus for tracking video objects according to an embodiment of the present invention;
fig. 9 is a schematic diagram of an apparatus for tracking video objects according to an embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings according to these drawings without inventive exercise.
In order to accurately track a video object with affine change and non-affine change, the method for tracking the video object in the embodiment of the invention comprises the steps of firstly obtaining a contour characteristic value of the video object in a current frame image; then, obtaining the characteristic points of the video object in the current frame image by using a mean shift method, and obtaining the matching characteristic points of the video object in the next frame image; then according to the matching feature points of the video object in the next frame image, obtaining the candidate contour of the video object in the appearance area, and solving the contour feature value of the video object in the next frame image; and finally, matching the contour characteristic value of the video object in the current frame image with the contour characteristic value of the video object in the next frame image to obtain the contour of the video object in the next frame image.
In order to make the technical advantages of the embodiments of the present invention clearer, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings.
As shown in fig. 1, a method for tracking a video object according to an embodiment of the present invention includes the following steps:
S1: extracting feature points of the video object contour in the current frame image;
S2: finding matched feature points matching the feature points in the next frame image;
S3: detecting at least one candidate contour of the video object in the next frame image according to the matched feature points;
S4: calculating the contour feature value of the video object in the current frame image;
S5: calculating the contour feature value of each candidate contour;
S6: comparing the contour feature value of the candidate contour with the contour feature value of the video object in the current frame image; if the two match, the candidate contour is the contour of the video object in the next frame image.
According to the method provided by the embodiment of the invention, firstly, the matching feature points of the video object in the current frame in the next frame image are determined, then the appearance area of the video object is predicted according to the matching feature points, and the candidate outline of the video object is detected in the predicted appearance area. And then matching the contour characteristic values of the video object in the front frame image and the rear frame image, wherein if the contour characteristic values of the video object in the front frame image and the rear frame image are matched, the candidate contour is the contour of the video object in the next frame image.
When the video object is subjected to non-affine change, the contour characteristic value of the video object in the next frame image can be extracted, and the contour characteristic value which is most matched with the contour characteristic value of the video object in the current frame image is obtained by matching the contour characteristic value of the video object in the current frame image, so that the contour of the video object in the next frame image can be accurately described. The method of the embodiment of the invention avoids the defect that the video object with non-affine change can not be accurately tracked in the prior art. Therefore, the method for tracking the video object in the embodiment of the invention not only can accurately track the video object with affine change, but also can track the video object with non-affine change, thereby improving the accuracy of tracking the video object. In addition, because the algorithm of the embodiment of the invention is simple, compared with the prior art, the method of the embodiment of the invention can improve the speed of tracking the video object.
As shown in fig. 2, the step S3 of detecting at least one candidate contour of the video object in the next frame of image according to the matching feature points includes:
s31: predicting the appearance area of the video object outline in the next frame image according to the matched feature points;
s32: at least one candidate contour of the video object is detected within the predicted region of occurrence.
Because the appearance area of the video object in the next frame of image is predicted at first, the operation amount of contour matching of the video object is reduced, and the efficiency of tracking the video object is improved.
In step S1, the feature point of the video object contour in the current frame image may be a central point of the video object in the current frame image; accordingly, the matching feature point in step S2 is the matching center point of the video object in the next frame image.
In addition, there are many ways to describe the contour of a video object, such as invariant moments, eccentricity, aspect ratio of the video object, form factor and wavelet contour descriptors. The wavelet contour descriptor has a clear physical meaning, good retrieval performance, and invariance to rotation and scaling, and can describe the contour feature value of the video object accurately. Therefore, the embodiment of the present invention adopts the wavelet contour descriptor as the contour feature value describing the contour of the video object.
The following describes a specific implementation process of the method for tracking a video object according to the embodiment of the present invention in detail with reference to fig. 3.
T1: and in the current frame image, carrying out contour detection on the video object to obtain contour points of the video object.
The various calculations performed in the current frame image are based on the previous frame of the current frame image. The calculations performed in the next frame of image are based on the current frame. Therefore, the principles of various calculations in the current frame and the next frame image are the same, and only the reference standard is different.
The method for detecting the contour points is as follows: in the current frame image, check all points of the connected bitmap $V_k$ within the range defined by the object index $M_k^j$; if the gray value of any of the four neighbors (upper, lower, left or right) of a point is 0, the point is marked as a contour point.
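For illustration, a minimal sketch of this four-neighbour contour-point test on a binary object mask (the array layout and the 0 = background convention are assumptions):

```python
import numpy as np

def detect_contour_points(mask):
    """Mark object pixels that have at least one background (gray value 0)
    pixel among their upper, lower, left or right neighbours."""
    h, w = mask.shape
    contour = np.zeros_like(mask, dtype=bool)
    for y in range(h):
        for x in range(w):
            if mask[y, x] == 0:
                continue  # background pixels cannot be contour points
            neighbours = [
                mask[y - 1, x] if y > 0 else 0,
                mask[y + 1, x] if y < h - 1 else 0,
                mask[y, x - 1] if x > 0 else 0,
                mask[y, x + 1] if x < w - 1 else 0,
            ]
            if min(neighbours) == 0:
                contour[y, x] = True
    return contour
```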
T2: and obtaining the contour vector of the video object from the contour points.
Assuming that the video object has $N_p$ contour points in the current frame image, the contour vector is defined as $P_k^j = (P_0, P_1, \ldots, P_{N_p-1})$.

After all contour points are found, they are sorted. The sorting method is as follows: starting from the upper edge of the region defined by the object index $M_k^j$, the first contour point found by horizontal search is the first contour point $P_0$. Then, with the first contour point $P_0$ as the center, the contour point found counterclockwise using a 3 × 3 search template is the second contour point $P_1$. Then, with the second contour point $P_1$ as the center, the search template is used to find the third contour point $P_2$ counterclockwise. By analogy, the last contour point found is $P_{N_p-1}$. Searching again with $P_{N_p-1}$ as the center using the 3 × 3 search template, the first contour point found should be $P_0$. This search method ignores the inner contour points of the video object; the output contour vector contains only the outer contour points of the video object.

Sorting the contour points in this way gives the contour vector $P_k^j = (P_0, \ldots, P_{N_p-1})$, where $P_n = (Px_n, Py_n)$, $n = 0, \ldots, N_p-1$.

After the contour vector is obtained, the centroid coordinates $(TX_k^j, TY_k^j)$ of the contour are calculated according to formulas (1) and (2):

$TX_k^j = \dfrac{1}{N_p} \displaystyle\sum_{n=0}^{N_p-1} x_n$    (1)

$TY_k^j = \dfrac{1}{N_p} \displaystyle\sum_{n=0}^{N_p-1} y_n$    (2)

where $(x_n, y_n)$ are the coordinates of each contour point, $n = 0, 1, \ldots, N_p-1$.
T3: calculating a normalized track vector with constant translation, rotation and scaling according to the contour vector
Figure GDA0000057822220000091
The calculation formulas are shown in the following (3), (4) and (5):
r n = ( x n - TX k j ) 2 + ( y n - TY k j ) 2 - - - ( 3 )
rmax=Max(r0,r1,...rN-1) (4)
Un=rn/rmax; (5)
wherein r isnFor the distance of each contour point to the centroid, rmaxN is the maximum of the distances of each contour point to the center of mass, N being 0, 1p-1。
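For illustration, a minimal sketch of formulas (1)–(5), computing the centroid and the normalized track vector from an ordered list of contour points (function and variable names are illustrative):

```python
import numpy as np

def normalized_track_vector(contour_points):
    """contour_points: (Np, 2) array of ordered contour coordinates (x_n, y_n).
    Returns the centroid (TX, TY) and the normalized track vector U."""
    pts = np.asarray(contour_points, dtype=float)
    tx, ty = pts.mean(axis=0)                        # formulas (1) and (2)
    r = np.hypot(pts[:, 0] - tx, pts[:, 1] - ty)     # formula (3)
    u = r / r.max()                                  # formulas (4) and (5)
    return (tx, ty), u
```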
T4: the obtained normalized wheel track vector
Figure GDA0000057822220000094
Reordering to obtain directional track vector
The method of reordering the normalized track vectors is as follows:
from said normalized track vector
Figure GDA0000057822220000096
Is/are as follows
Figure GDA0000057822220000097
In (3), all the maximum and minimum values are found. Assuming that J maxima and K minima are found, J x K "maximum-minimum value pairs" can be formed between these maxima and minima. And finding a pair of maximum-minimum value pairs with maximum subscript interval between the maximum value and the minimum value from the J × K maximum-minimum value pairs. Since the first term and the last term in the normalized track distance vector are adjacent on the outline of the video object, the interval between any two vectors can be kept at NpWithin/2. Therefore, if the distance d between two maximum and minimum values is larger than NpAnd/2, making d equal to Np/2。
If there is only one "max-min pair", then the minimum value of the "max-min pair" is the most appropriate for the directional track vectorThe first term q in0And such that the maximum is in the first N of the directional track vectorpWithin/2, sorting the normalized track distance vector according to the direction from the minimum value to the maximum value to obtain an oriented track distance vector
If there are multiple "max-min value pairs," then a comparison of the neighbors of the minimum or maximum values is used to determine which "max-min value pair" to select as a basis for calculating the directional track vector. For example, in a first pair of "max-min value pairs", the neighbors of the maxima are greater than the neighbors of the maxima in a second pair of "max-min value pairs", then the first pair of "max-min value pairs" will be the basis for calculating the directional track vector. If the maximum values of the adjacent terms are equal, the outline of the video object is symmetrical.
T5: vector the directional track
Figure GDA00000578222200000910
Length normalization is performed to form a length normalized directional track vector having a fixed length (e.g., length M1024)
Figure GDA0000057822220000101
The calculation formula is as follows:
a = [ i M N p ] ; - - - ( 6 )
b=[a+1]; (7)
c = i M N p - a ; - - - ( 8 )
Li=(1-c)×qa+c×qb,(i=0,1,......M-1);(9)
wherein a and b are integers, and c is a floating point number.
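For illustration, a minimal sketch of the length normalization of formulas (6)–(9), resampling the directional track vector q to a fixed length M by linear interpolation (the clamping of index b at the last element is an added assumption for the boundary case):

```python
import numpy as np

def length_normalize(q, M=1024):
    """Resample the directional track vector q (length Np) to fixed length M."""
    q = np.asarray(q, dtype=float)
    Np = len(q)
    L = np.empty(M)
    for i in range(M):
        pos = i * Np / M               # (i / M) * Np
        a = int(pos)                   # formula (6)
        b = min(a + 1, Np - 1)         # formula (7), clamped at the last index
        c = pos - a                    # formula (8)
        L[i] = (1 - c) * q[a] + c * q[b]   # formula (9)
    return L
```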
T6: normalizing the directional track vector by said length
Figure GDA0000057822220000104
Calculating to obtain a wavelet contour descriptor B of the video object in the current frame imagek={b0,b1,...bN-1}。
Normalizing said length oriented track vector
Figure GDA0000057822220000105
Harr wavelet transform is carried out to obtain Harr wavelet transform result
Figure GDA0000057822220000106
The specific Harr wavelet transform is implemented as follows:
a one-dimensional array L with a length m is provided, and m is a power of 2, the Harr wavelet transform for the array can be implemented by the following pseudo-code method:
Figure GDA0000057822220000107
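A minimal sketch of a standard 1-D Haar wavelet transform for a power-of-two-length array, consistent with the description above (an illustrative routine with an averaging/differencing normalization chosen for simplicity, not the patent's own pseudo-code):

```python
import numpy as np

def haar_transform(data):
    """Full Haar wavelet decomposition of a 1-D array whose length is a power of 2.
    Each pass replaces the current low-pass band with pairwise averages and
    stores pairwise differences as detail coefficients."""
    w = np.asarray(data, dtype=float).copy()
    m = len(w)
    assert m & (m - 1) == 0, "length must be a power of 2"
    length = m
    while length > 1:
        half = length // 2
        tmp = w[:length].copy()
        w[:half] = (tmp[0::2] + tmp[1::2]) / 2.0        # approximation (averages)
        w[half:length] = (tmp[0::2] - tmp[1::2]) / 2.0  # detail (differences)
        length = half
    return w

# The wavelet contour descriptor of resolution N is then simply the first N
# coefficients of the transform result, as in equation (10):
# descriptor = haar_transform(L)[:N]
```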
By the method described above, the length-normalized directional track vector $L_k^j = \{L_0, L_1, \ldots, L_{M-1}\}$ is converted into the Haar wavelet transform result $W = \{w_0, w_1, \ldots, w_{M-1}\}$; both are equal in length. A schematic diagram of the result of the Haar wavelet transform is shown in Fig. 4.

Depending on the image resolution N, the wavelet contour descriptor $B^n = \{b_0^n, b_1^n, \ldots, b_{N-1}^n\}$ can be obtained from equation (10):

$B^n = \{b_0^n, b_1^n, \ldots, b_{N-1}^n\} = \{w_0^n, w_1^n, \ldots, w_{N-1}^n\}$    (10)

As can be seen from equation (10), the wavelet contour descriptor $B^n$ is obtained by truncating the Haar wavelet transform result W to its first N coefficients.
Figure 5 shows the wavelet contour descriptors computed at resolutions 256, 64 and 16, respectively. In practical applications, to reduce the computation required for comparing video object contours, the resolution can be taken as 16.
After the wavelet contour descriptor $B^n = \{b_0^n, b_1^n, \ldots, b_{N-1}^n\}$ of the video object in the current frame image has been obtained, the occurrence region of the video object in the next frame image must first be determined and the candidate contours of the video object within that region detected; then the wavelet contour descriptor $B^{n+1} = \{b_0^{n+1}, b_1^{n+1}, \ldots, b_{N-1}^{n+1}\}$ of each candidate contour of the video object in the next frame image is calculated.
The following describes the above calculation process in detail.
T7: and obtaining the appearance area of the video object in the next frame of image.
T71: and obtaining the central point of the video object in the current frame image.
T72: and calculating the matching center point of the video object in the next frame image according to the center point of the obtained video object in the current frame image.
In the embodiment of the invention, the matching central point of the video object in the next frame of image is calculated by adopting a mean shift method. Then, when calculating the center point of the video object in the current frame image, the center point is calculated by using the previous frame image of the current frame image as a reference, and the calculation principle is the same as the calculation process described below.
Suppose that $\{x_i^*\}_{i=1,\ldots,n}$ denotes the normalized pixel positions of the video object model, with center point O; the color gray values of the video object are further quantized into m levels, and b(x) is the mapping from the pixel at position x to its color index. The probability of color u occurring is defined as:

$\bar{q}_u = \alpha \displaystyle\sum_{i=1}^{n} k\left(\|x_i^*\|^2\right)\,\delta\left[b(x_i^*) - u\right]$    (11)

where k(x) is a kernel function that gives a smaller weight to pixels farther from the center point, and $\alpha$ is a constant whose expression is:

$\alpha = \dfrac{1}{\displaystyle\sum_{i=1}^{n} k\left(\|x_i^*\|^2\right)}$    (12)

The video object model is then represented as:

$\bar{q} = \{\bar{q}_u\}_{u=1,\ldots,m}, \quad \displaystyle\sum_{u=1}^{m} \bar{q}_u = 1$    (13)

Suppose that $\{x_i\}_{i=1,\ldots,n_h}$ are the pixel positions of the candidate video object in the current frame, with center point C; applying the same kernel function k(x) within the radius h of the center point, the probability of color u occurring in the candidate video object can be expressed as:

$\bar{p}_u(C) = \alpha_h \displaystyle\sum_{i=1}^{n_h} k\left(\left\|\dfrac{C - x_i}{h}\right\|^2\right)\delta\left[b(x_i) - u\right]$    (14)

where $\alpha_h$ is a constant whose expression is:

$\alpha_h = \dfrac{1}{\displaystyle\sum_{i=1}^{n_h} k\left(\left\|\dfrac{C - x_i}{h}\right\|^2\right)}$    (15)

The candidate video object model is then represented as:

$\bar{p}(C) = \{\bar{p}_u(C)\}_{u=1,\ldots,m}, \quad \displaystyle\sum_{u=1}^{m} \bar{p}_u = 1$    (16)

From the video object model and the candidate video object model defined above, the distance d(C) between them can be calculated:

$d(C) = \sqrt{1 - \rho\left[\bar{p}(C), \bar{q}\right]}$    (17)

where $\rho\left[\bar{p}(C), \bar{q}\right] = \displaystyle\sum_{u=1}^{m} \sqrt{\bar{p}_u(C)\,\bar{q}_u}$.
from the above analysis, it can be seen that the best candidate video object of the video objects in the current image frame is the candidate video object closest to the video object model, i.e. the candidate region that minimizes the distance d (c). Therefore, the minimum value of d (C) is obtained, and the matching central point of the video object in the next frame image can be determined.
The minimum of d(C) can be found with the iterative formula (18):

$\bar{C}_1 = \dfrac{\displaystyle\sum_{i=1}^{n_h} x_i\, w_i\, g\left(\left\|\dfrac{\bar{C}_0 - x_i}{h}\right\|^2\right)}{\displaystyle\sum_{i=1}^{n_h} w_i\, g\left(\left\|\dfrac{\bar{C}_0 - x_i}{h}\right\|^2\right)}$    (18)

where $\bar{C}_0$ is the current center point of the video object, $\bar{C}_1$ is the matching center point of the video object in the next frame image, and the weight $w_i$ is given by:

$w_i = \displaystyle\sum_{u=1}^{m} \sqrt{\dfrac{\bar{q}_u}{\bar{p}_u(\bar{C}_0)}}\;\delta\left[b(x_i) - u\right]$    (19)

By applying the iterative formula (18) to each frame of the image, the candidate video object for which d(C) is minimal, together with its center point, is obtained as the best candidate for the video object; this gives the matching center point of the video object in the next frame image. Using the mean shift method increases the speed of determining the matched feature points and improves the efficiency of the whole tracking process. Of course, the mean shift method need not be used when calculating the matching feature points of the video object in the next frame image: after the feature point of the video object in the current frame image is determined, the region where the feature point is likely to appear in the next frame image can be determined first, and pixels in that region matched one by one until the best-matching feature point is obtained.
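For illustration, a minimal sketch of the mean shift center search of formulas (11)–(19) on a gray-level image (the rectangular window, the Epanechnikov-style kernel and the 16-bin quantization are assumptions; with this kernel the derivative g(·) in formula (18) is constant and cancels out):

```python
import numpy as np

def color_histogram(image, center, half_size, m=16):
    """Kernel-weighted color histogram (q_u or p_u(C)) around `center` (cy, cx)."""
    cy, cx = int(center[0]), int(center[1])
    h, w = half_size
    hist = np.zeros(m)
    for y in range(max(cy - h, 0), min(cy + h + 1, image.shape[0])):
        for x in range(max(cx - w, 0), min(cx + w + 1, image.shape[1])):
            r2 = ((y - cy) / h) ** 2 + ((x - cx) / w) ** 2
            if r2 > 1.0:
                continue
            k = 1.0 - r2                     # kernel: less weight far from center
            u = int(image[y, x]) * m // 256  # b(x): gray value -> color index
            hist[u] += k
    return hist / hist.sum()

def mean_shift_center(image, q_model, center, half_size, m=16, iters=20):
    """Iterate formula (18) to find the matching center point in the next frame."""
    cy, cx = int(center[0]), int(center[1])
    h, w = half_size
    for _ in range(iters):
        p = color_histogram(image, (cy, cx), half_size, m)
        num_y = num_x = den = 0.0
        for y in range(max(cy - h, 0), min(cy + h + 1, image.shape[0])):
            for x in range(max(cx - w, 0), min(cx + w + 1, image.shape[1])):
                r2 = ((y - cy) / h) ** 2 + ((x - cx) / w) ** 2
                if r2 > 1.0:
                    continue
                u = int(image[y, x]) * m // 256
                wi = np.sqrt(q_model[u] / p[u]) if p[u] > 0 else 0.0  # formula (19)
                num_y += y * wi
                num_x += x * wi
                den += wi
        if den == 0:
            break
        ny, nx = int(round(num_y / den)), int(round(num_x / den))     # formula (18)
        if (ny, nx) == (cy, cx):
            break
        cy, cx = ny, nx
    return cy, cx
```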
T73: after the matching central point of the video object in the next frame image is obtained, the occurrence area of the video object in the next frame image is predicted by using a linear method and taking the matching central point as the center. The linear method may include translation, rotation, and the like.
For example, let $(Left_k, Top_k, Right_k, Bottom_k)$ denote the bounding range of the video object contour obtained from the current frame image; the bounding range of the video object contour in the next frame image, i.e. the occurrence region, is denoted $(Left_{k+1}, Top_{k+1}, Right_{k+1}, Bottom_{k+1})$.

The prediction formula for the contour range of the video object in the next frame image is:

$Left_{k+1} = Left_k - w_k/2$

$Top_{k+1} = Top_k - h_k/2$

$Right_{k+1} = Right_k + w_k/2$    (20)

$Bottom_{k+1} = Bottom_k + h_k/2$

where $(CX_k, CY_k)$ are the coordinates of the center point of the video object in the current frame image, and $(CX_{k+1}, CY_{k+1})$ are the coordinates of the matching center point of the video object in the next frame image, obtained by the mean shift calculation above.

In formula (20),

$w_k = (Right_k - Left_k)\,\dfrac{speed_k - speed\_min}{speed\_max - speed\_min}$

$h_k = (Bottom_k - Top_k)\,\dfrac{speed_k - speed\_min}{speed\_max - speed\_min}$    (21)

where speed_min is the minimum moving speed of the video object (typically 0), speed_max is the maximum moving speed of the video object (typically 1), and $speed_k$ is the actual speed of the object in the previous frame, calculated as:

$speed_k = \dfrac{\sqrt{(CX_k - CX_{k-N})^2 + (CY_k - CY_{k-N})^2}}{N\,(Right_{k-N} - Left_{k-N})}$    (22)

where N is the number of frames (typically 10) between the two frames used to calculate the speed.
It should be noted that the method for obtaining the matching center point of the video object in the next frame image is not limited to the mean shift method mentioned in the present embodiment. Any method capable of obtaining the center point of the video object in the image can be applied to the embodiment of the present invention.
T8: After the matching center point of the video object in the next frame image is obtained, contour detection is performed on the video object in the next frame image within the occurrence region to obtain the candidate contours of the video object, and the wavelet contour descriptor $B^{n+1} = \{b_0^{n+1}, b_1^{n+1}, \ldots, b_{N-1}^{n+1}\}$ of each candidate contour in the next frame image is calculated.

In this step, the process of calculating the wavelet contour descriptor of the video object in the next frame image follows the same principle as steps T1–T6 described above and is not repeated here.
T9: after the wavelet contour descriptors of the video objects in the current image frame and the next image frame are obtained, contour matching is carried out on the video objects to obtain the contour of the video objects in the next image frame, and therefore tracking of the video objects is completed.
The method for performing contour matching on the video object comprises the following steps:
T91: Compare the wavelet contour descriptor $B^{n+1} = \{b_0^{n+1}, b_1^{n+1}, \ldots, b_{N-1}^{n+1}\}$ of each candidate contour in the occurrence region with the wavelet contour descriptor $B^n = \{b_0^n, b_1^n, \ldots, b_{N-1}^n\}$ in the current frame image, and calculate the similarity between the two. The similarity is calculated according to the following formula:
$Similarity = 1 - \dfrac{1}{N}\displaystyle\sum_{i=0}^{N-1} \left(b_i^{n+1} - b_i^n\right)^2$    (23)

If the similarity between the wavelet contour descriptors in the two successive frame images exceeds the similarity threshold, the candidate in $B^{n+1}$ with the highest similarity value is selected as the tracking result corresponding to $B^n$.

The similarity threshold may be freely defined; in this embodiment, to ensure the accuracy of tracking the video object, the similarity threshold is set to 80%.

If several candidate contours match $B^n$, the most similar one is selected as the tracking result for $B^n$.

If tracking of $B^n$ fails, the video object corresponding to $B^n$ is occluded or has disappeared; if no source tracking object is found for $B^{n+1}$, the video object corresponding to $B^{n+1}$ is a video object that newly appears in the next frame image.

Finally, the video object contour corresponding to the best-matching wavelet contour descriptor is matched with the contour of the video object in the current frame image; if their similarity exceeds the threshold, the video object in the current frame image is tracked successfully, otherwise the tracking fails.
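For illustration, a minimal sketch of the descriptor matching of formula (23) with the 80% similarity threshold (function and variable names are illustrative):

```python
import numpy as np

def descriptor_similarity(b_current, b_candidate):
    """Formula (23): similarity between two wavelet contour descriptors of length N."""
    b_current = np.asarray(b_current, dtype=float)
    b_candidate = np.asarray(b_candidate, dtype=float)
    return 1.0 - np.mean((b_candidate - b_current) ** 2)

def match_contour(b_current, candidate_descriptors, threshold=0.8):
    """Return the index of the best-matching candidate contour, or None when all
    candidates fall below the similarity threshold (tracking failure)."""
    best_idx, best_sim = None, threshold
    for idx, b_cand in enumerate(candidate_descriptors):
        sim = descriptor_similarity(b_current, b_cand)
        if sim >= best_sim:
            best_idx, best_sim = idx, sim
    return best_idx
```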
By using the method, even if the tracked video object is deformed in the motion process, the method provided by the embodiment of the invention can still accurately track the video object by using a mode of combining a mean shift method and contour matching.
The results of tracking the video object by using the method for tracking the video object according to the embodiment of the present invention are shown in fig. 6 and 7. As can be seen from the tracking result, the method for tracking a video object according to the embodiment of the present invention has a relatively ideal tracking effect, can track the contour of the video object relatively accurately, and even when the legs and arms of the pedestrian are bent or the like, the curve of the candidate contour and the real contour of the video object are relatively matched, for example, c) and d) in fig. 6.
The tracking process of the method for tracking the video object is stable, and the contour tracking can be stably carried out even if the movement speed of the object is changed greatly. As shown in fig. 7, when a car in the car video drives into the parking lot from fast to slow, the algorithm realizes stable tracking.
In addition, compared with the Condensation tracking algorithm, the video object tracking method provided by the embodiment of the invention has the advantages that the calculated amount is small, and the tracking speed is greatly improved. The tracking speed values when tracking is performed using the two tracking algorithms, respectively, are listed in table 1.
TABLE 1
It can be seen from the above experiments that the method for tracking a video object according to the embodiment of the present invention can not only accurately track a video object with affine change or non-affine change, but also improve the tracking speed of the video object compared with the prior art because the algorithm of the embodiment of the present invention is simple.
Corresponding to the method for tracking the video object in the embodiment of the invention, the embodiment of the invention also provides a device for tracking the video object.
As shown in fig. 8, an apparatus for tracking a video object according to an embodiment of the present invention includes:
a first positioning unit 801, configured to acquire feature points of the video object contour in the current frame image;
a second positioning unit 802, configured to find a matching feature point matching the feature point in a next frame of image;
a contour detection unit 803, configured to detect at least one candidate contour of the video object in a next frame image according to the matching feature points;
a first calculating unit 804, configured to calculate the contour feature value of the video object in the current frame image;
a second calculating unit 805, configured to calculate the contour feature value of the candidate contour;
a contour matching unit 806, configured to compare the contour feature value of the candidate contour with the contour feature value of the video object in the current frame image, and if the two contour feature values match, the candidate contour is the contour of the video object in the next frame image.
With the apparatus according to the embodiment of the present invention, the contour detection unit 803 first determines the candidate contours of the video object in the next frame image, the first calculating unit 804 and the second calculating unit 805 calculate the contour feature values of the video object in the current and next frame images respectively, and the contour matching unit 806 matches the two contour feature values. Even when the video object undergoes non-affine changes, the contour feature value of the video object in the next frame image can still be extracted and matched against the contour feature value of the video object in the current frame image to find the best match, so that the contour of the video object in the next frame image can be accurately described. The apparatus of the embodiment of the present invention thus overcomes the defect of the prior art that video objects with non-affine changes cannot be accurately tracked.
Therefore, the device for tracking the video object in the embodiment of the invention not only can accurately track the video object with affine change, but also can track the video object with non-affine change, thereby improving the accuracy of tracking the video object.
As noted above, there are many ways to describe the contour of a video object, such as invariant moments, eccentricity, aspect ratio of the video object, shape factor and wavelet contour descriptors. The wavelet contour descriptor has a clear physical meaning, good retrieval performance, and invariance to rotation and scaling, and can describe the contour feature value of a video object accurately. Therefore, in the apparatus for tracking a video object in the embodiment of the present invention, the wavelet contour descriptor is used as the contour feature value describing the contour of the video object.
As shown in fig. 9, the contour detection unit 803 includes:
a region prediction module 8031, configured to predict, according to the matching feature points, an occurrence region of the video object contour in a next frame image;
a contour extraction module 8032, configured to detect at least one candidate contour of the video object in the occurrence area.
The method has the advantages that the predicted occurrence area of the video object in the next frame of image is predicted, the contour of the video object can be selected in a targeted manner, the calculated amount of contour matching of the video object is reduced, and the speed and the efficiency of tracking the video object are improved.
The first calculating unit 804 includes:
a first contour detection module 8041, configured to perform contour detection on the video object in the current frame image to obtain the contour points of the video object;
a first normalized track vector calculation module 8042, configured to obtain the normalized track vector of the video object from the contour points;
a first directional track vector calculation module 8043, configured to calculate the directional track vector of the video object from the normalized track vector;
a first length-normalized directional track vector calculation module 8044, configured to length-normalize the directional track vector to obtain the length-normalized directional track vector;
a first contour feature value calculation module 8045, configured to obtain the wavelet contour descriptor of the video object from the length-normalized directional track vector.
The second calculating unit 805 comprises:
an area prediction module 8051, configured to obtain the occurrence region of the video object in the next frame image;
a second contour detection module 8052, configured to perform contour detection on the video object within the occurrence region to obtain the contour points of the video object;
a second normalized track vector calculation module 8053, configured to obtain the normalized track vector of the video object from the contour points;
a second directional track vector calculation module 8054, configured to calculate the directional track vector of the video object from the normalized track vector;
a second length-normalized directional track vector calculation module 8055, configured to length-normalize the directional track vector to obtain the length-normalized directional track vector;
a second contour feature value calculation module 8056, configured to obtain the wavelet contour descriptor of the video object from the length-normalized directional track vector.
The algorithms used in the calculation process of the modules of the first calculation unit 804 and the second calculation unit 805 are the same as those used in the embodiment of the method for tracking a video object, and are not described herein again.
In summary, the method and the device for tracking the video object according to the embodiments of the present invention can not only improve the accuracy of tracking the video object, but also improve the speed of tracking the video object due to the simple algorithm of the embodiments of the present invention.
There are, of course, many possible embodiments of this invention and it is intended that all such other embodiments as may be obtained by those skilled in the art without departing from the spirit and scope of the invention and without any inventive step are deemed to be covered by the present invention and all such modifications, equivalents and alternatives falling within the scope and spirit of the invention.

Claims (21)

1. A method of tracking a video object, the method comprising the steps of:
extracting feature points of the video object contour in a current frame image;
finding matched feature points matched with the feature points in the next frame of image;
detecting at least one candidate contour of the video object in a next frame of image according to the matched feature points, wherein the method comprises the following steps: taking the matched feature point as a center, obtaining an appearance region of the video object contour in a next frame image through linear transformation, and detecting at least one candidate contour of the video object in the appearance region;
calculating a contour characteristic value of a video object in the current frame image;
calculating contour characteristic values of the candidate contours;
and comparing the contour characteristic value of the candidate contour with the contour characteristic value of the video object in the current frame image, and if the two contour characteristic values are matched, determining that the candidate contour is the contour of the video object in the next frame image.
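The "linear transformation" used in claim 1 to obtain the occurrence region is not spelled out; one plausible reading, sketched below, is to take the bounding box of the current contour, enlarge it by a fixed factor, and re-center it at the matched feature point. The scale factor and the helper name are assumptions, not part of the claim.

    import numpy as np

    def predict_occurrence_region(contour, matched_point, scale=1.5):
        """Assumed reading of the claim: scale the current contour's bounding box
        and re-center it at the matched feature point to get the search region."""
        contour = np.asarray(contour, dtype=np.float64)
        xs, ys = contour[:, 0], contour[:, 1]
        w = (xs.max() - xs.min()) * scale
        h = (ys.max() - ys.min()) * scale
        cx, cy = matched_point
        # Region returned as (x_min, y_min, x_max, y_max).
        return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)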
2. A method for tracking a video object as claimed in claim 1, wherein the contour feature value is a wavelet contour descriptor, or an invariant moment of the contour, or eccentricity, or form factor.
3. The method according to claim 1, wherein the process of finding the matching feature point matching the feature point in the next frame image specifically comprises:
and finding matched feature points matched with the feature points in the next frame of image by using a mean shift method.
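Claim 3 only names the mean shift method; as an illustration, the sketch below uses OpenCV's standard histogram back-projection together with cv2.meanShift to move a small window around a feature point from the current frame to the next one. The window size and histogram settings are arbitrary assumptions.

    import cv2

    def mean_shift_match(curr_bgr, next_bgr, feature_point, win=21):
        """Track one feature point from curr_bgr to next_bgr with mean shift.
        Returns the matched point (x, y) in the next frame."""
        x, y = int(feature_point[0]), int(feature_point[1])
        x0, y0 = max(x - win // 2, 0), max(y - win // 2, 0)
        window = (x0, y0, win, win)                              # (x, y, w, h)

        # Hue histogram of the patch around the feature point in the current frame.
        hsv_curr = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2HSV)
        patch = hsv_curr[y0:y0 + win, x0:x0 + win]
        hist = cv2.calcHist([patch], [0], None, [32], [0, 180])
        cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

        # Back-project the histogram into the next frame and let mean shift
        # re-locate the window there.
        hsv_next = cv2.cvtColor(next_bgr, cv2.COLOR_BGR2HSV)
        back_proj = cv2.calcBackProject([hsv_next], [0], hist, [0, 180], 1)
        criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)
        _, window = cv2.meanShift(back_proj, window, criteria)

        return (window[0] + win // 2, window[1] + win // 2)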
4. A method for tracking a video object as claimed in claim 1, wherein the process of detecting at least one candidate contour of the video object in the occurrence region is specifically:
in the appearance area, carrying out contour detection on the video object to obtain contour points of the video object;
sequencing the contour points to obtain a contour vector $P_{k+1}^i = (P_0, \ldots, P_{N_p-1})$ of the video object, wherein $P_0$ is the first contour point, $P_{N_p-1}$ is the $N_p$-th contour point, and $N_p$ is an integer greater than 0.
5. The method for tracking a video object according to claim 4, wherein the process of sequencing the contour points to obtain the contour vector $P_{k+1}^i = (P_0, \ldots, P_{N_p-1})$ of the video object specifically comprises:
starting from the upper edge of the range defined by the video object index, taking the first contour point found by searching in the horizontal direction as the first contour point $P_0$;
with the first contour point $P_0$ as the center of a search template, searching within the range determined by the search template in the counterclockwise direction to obtain a second contour point $P_1$;
obtaining each subsequent contour point in the same way as the second contour point is obtained from the first contour point, until the $N_p$-th contour point $P_{N_p-1}$ is found;
obtaining the contour vector $P_{k+1}^i = (P_0, \ldots, P_{N_p-1})$ of the video object from (the first contour point $P_0$, ..., the $N_p$-th contour point $P_{N_p-1}$).
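Claims 4 and 5 describe tracing the contour counterclockwise with a search template starting from the topmost boundary point. The sketch below does not reimplement that exact template search; it substitutes OpenCV's cv2.findContours on a binary mask of the occurrence region and then rotates the point list so that the topmost point comes first, which yields an ordered contour vector of the same form.

    import cv2
    import numpy as np

    def ordered_contour_points(binary_mask):
        """Return an ordered contour vector (an Np x 2 array of (x, y) points)
        for the largest object in a binary mask, starting at the topmost point."""
        # findContours returns 2 or 3 values depending on the OpenCV version;
        # the contour list is always the second-to-last element.
        result = cv2.findContours(binary_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        contours = result[-2]
        if not contours:
            return np.empty((0, 2), dtype=np.int32)
        pts = max(contours, key=cv2.contourArea).reshape(-1, 2)   # one (x, y) per row

        # Start the sequence at the topmost (then leftmost) point, mirroring the
        # claim's "search from the upper edge in the horizontal direction".
        start = np.lexsort((pts[:, 0], pts[:, 1]))[0]
        return np.roll(pts, -start, axis=0)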
6. The method for tracking a video object according to claim 5, wherein the process of calculating the contour feature value of the candidate contour specifically comprises:
obtaining a normalized track vector $U_{k+1}^i = (U_0, U_1, \ldots, U_{N_p-1})$ of the video object from the contour vector $P_{k+1}^i$, wherein $U_0$ is the quotient of the distance from the first contour point to the centroid of the contour of the video object in the current frame image and the maximum of the distances from the contour points to said centroid, $U_1$ is the quotient of the distance from the second contour point to said centroid and the maximum of the distances from the contour points to said centroid, and $U_{N_p-1}$ is the quotient of the distance from the $N_p$-th contour point to said centroid and the maximum of the distances from the contour points to said centroid;
calculating a directional track vector $Q_{k+1}^i = (q_0, q_1, \ldots, q_{N_p-1})$ of the video object from the normalized track vector $U_{k+1}^i$, wherein $q_0, \ldots, q_{N_p-1}$ are the result of reordering $U_0, \ldots, U_{N_p-1}$;
performing length normalization on the directional track vector $Q_{k+1}^i$ to obtain a length-normalized directional track vector $L_{k+1}^i = (L_0, L_1, \ldots, L_{M-1})$ of the video object, wherein $L_0, \ldots, L_{M-1}$ are the result of performing length normalization on $Q_{k+1}^i$;
obtaining a wavelet contour descriptor $B_{k+1} = \{b_0, b_1, \ldots, b_{N-1}\}$ of the video object from the length-normalized directional track vector $L_{k+1}^i$, wherein $b_0, \ldots, b_{N-1}$ represent the truncated wavelet transform result of $L_0, L_1, \ldots, L_{M-1}$ and $N$ represents the coefficient length of the truncated wavelet transform result;
wherein $N_p$ is the number of contour points constituting the contour vector $P_{k+1}^i$, and $M$ is the length coefficient of the length-normalized directional track vector $L_{k+1}^i$.
7. The method for tracking a video object according to claim 6, wherein the process of obtaining the normalized track vector $U_{k+1}^i = (U_0, U_1, \ldots, U_{N_p-1})$ of the video object from the contour vector $P_{k+1}^i$ specifically comprises:
calculating the centroid coordinates of the contour from the contour vector, the calculation formula of the centroid coordinates being:
$TX_{k+1}^i = \frac{1}{N_p} \sum_{n=0}^{N_p-1} x_n$, $TY_{k+1}^i = \frac{1}{N_p} \sum_{n=0}^{N_p-1} y_n$,
wherein $(x_n, y_n)$ are the coordinates of each contour point, $n = 0, 1, \ldots, N_p-1$;
calculating the normalized track vector $U_{k+1}^i = (U_0, U_1, \ldots, U_{N_p-1})$:
$r_n = \sqrt{(x_n - TX_{k+1}^i)^2 + (y_n - TY_{k+1}^i)^2}$,
$r_{\max} = \mathrm{Max}(r_0, r_1, \ldots, r_{N_p-1})$,
$U_n = r_n / r_{\max}$, $n = 0, \ldots, N_p-1$;
wherein $r_n$ is the distance from each contour point to the centroid and $r_{\max}$ is the maximum of the distances from the contour points to the centroid.
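As an illustration of claim 7 (the same formulas reappear in claim 13), a minimal numpy sketch is given below; `contour` is assumed to be an Np x 2 array of (x, y) contour point coordinates.

    import numpy as np

    def normalized_track_vector(contour):
        """Compute U = (U_0, ..., U_{Np-1}) as in claim 7: the distance of each
        contour point to the contour centroid, divided by the maximum distance."""
        contour = np.asarray(contour, dtype=np.float64)
        tx, ty = contour.mean(axis=0)                  # centroid (TX, TY)
        r = np.hypot(contour[:, 0] - tx, contour[:, 1] - ty)
        return r / r.max()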
8. The method for tracking a video object according to claim 6, wherein the process of calculating the directional track vector $Q_{k+1}^i = (q_0, q_1, \ldots, q_{N_p-1})$ of the video object from the normalized track vector $U_{k+1}^i$ specifically comprises:
finding the maximum values and minimum values in the normalized track vector $U_{k+1}^i$ to form maximum-minimum value pairs;
finding, among the maximum-minimum value pairs, the pair whose maximum-value subscript and minimum-value subscript have the largest interval;
reordering according to the maximum-minimum value pair with the largest subscript interval, in the order from the minimum value to the maximum value, to obtain the directional track vector $Q_{k+1}^i = (q_0, q_1, \ldots, q_{N_p-1})$;
wherein the order "from the minimum value to the maximum value" means ordering with the minimum value as the starting term such that the maximum value appears within the first $N_p/2$ entries.
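Claim 8 reorders the normalized track vector so that it starts at a minimum and the corresponding maximum falls within the first Np/2 entries; the exact pairing rule leaves some room for interpretation. The sketch below takes one plausible reading: start the circular sequence at the global minimum, and reverse the traversal direction if the global maximum would otherwise fall in the second half.

    import numpy as np

    def directional_track_vector(u):
        """One possible reading of claim 8: rotate the normalized track vector u
        so it starts at its minimum, walking in whichever circular direction puts
        the maximum within the first half of the sequence."""
        u = np.asarray(u, dtype=np.float64)
        n = len(u)
        i_min = int(np.argmin(u))
        i_max = int(np.argmax(u))

        forward = np.roll(u, -i_min)                   # start at the minimum
        pos_max = (i_max - i_min) % n                  # position of the maximum after rotation
        if pos_max <= n // 2:
            return forward
        # Otherwise walk the contour the other way round, still starting at the minimum.
        return np.roll(u[::-1], -(n - 1 - i_min))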
9. The method for tracking a video object according to claim 6, wherein the process of performing length normalization on the directional track vector $Q_{k+1}^i$ to obtain the length-normalized directional track vector $L_{k+1}^i = (L_0, L_1, \ldots, L_{M-1})$ of the video object specifically comprises:
$a = \left[ \frac{i}{M} N_p \right]$;
$b = [a + 1]$;
$c = \frac{i}{M} N_p - a$;
$L_i = (1 - c) \times q_a + c \times q_b$, $(i = 0, 1, \ldots, M-1)$;
wherein $a$, $b$ and $c$ are constants.
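A numpy sketch of the length normalization of claim 9 (and claim 15), read as linear interpolation of the Np-point directional track vector onto a fixed length M; taking the bracket [·] as the floor operation and the default M = 128 are assumptions about the claim's notation, not part of the claim.

    import numpy as np

    def length_normalize(q, m=128):
        """Resample the directional track vector q (length Np) to length m
        by linear interpolation, following the a/b/c construction of claim 9."""
        q = np.asarray(q, dtype=np.float64)
        np_len = len(q)
        out = np.empty(m, dtype=np.float64)
        for i in range(m):
            pos = i * np_len / m            # fractional source index (i/M)*Np
            a = int(np.floor(pos))          # a = [(i/M)*Np]
            b = min(a + 1, np_len - 1)      # b = a + 1, clamped at the last entry
            c = pos - a
            out[i] = (1.0 - c) * q[a] + c * q[b]
        return out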
10. The method for tracking a video object according to claim 6, wherein the process of obtaining the wavelet contour descriptor $B_{k+1} = \{b_0, b_1, \ldots, b_{N-1}\}$ of the video object from the length-normalized directional track vector $L_{k+1}^i$ specifically comprises:
performing a wavelet transform on the length-normalized directional track vector $L_{k+1}^i$ to obtain a wavelet transform result $(w_0, w_1, \ldots, w_{M-1})$, wherein $w_0, w_1, \ldots, w_{M-1}$ are the result of performing the wavelet transform on $L_0, L_1, \ldots, L_{M-1}$;
truncating the transform result $(w_0, w_1, \ldots, w_{M-1})$ according to the image resolution to obtain $B_{k+1} = \{b_0, b_1, \ldots, b_{N-1}\} = (w_0, w_1, \ldots, w_{N-1})$, the number of truncated coefficients being the same as the value of the resolution.
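An illustrative sketch of the wavelet contour descriptor of claim 10 (and claim 16), using the PyWavelets package as a stand-in for the unspecified wavelet transform; the choice of the Haar wavelet, the way coefficients are concatenated before truncation, and the default of 32 kept coefficients are all assumptions.

    import numpy as np
    import pywt

    def wavelet_contour_descriptor(l_vec, n_coeffs=32, wavelet="haar"):
        """Wavelet-transform the length-normalized directional track vector and
        keep the first n_coeffs coefficients (claim 10 ties this number to the
        image resolution; here it is simply a parameter)."""
        coeffs = pywt.wavedec(np.asarray(l_vec, dtype=np.float64), wavelet)
        flat = np.concatenate(coeffs)          # (w_0, w_1, ...) up to coefficient ordering
        return flat[:n_coeffs]                 # B = (b_0, ..., b_{N-1})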
11. The method for tracking a video object according to claim 2, wherein when the contour feature value is a wavelet contour descriptor, the process of calculating the contour feature value of the video object in the current frame image specifically comprises:
performing contour detection on the video object in the current frame image to obtain a contour vector $P_k^i = (P_0, \ldots, P_{N_p-1})$ of the video object and the coordinates of each contour point;
obtaining a normalized track vector $U_k^i = (U_0, U_1, \ldots, U_{N_p-1})$ of the video object from the contour vector $P_k^i$, wherein $U_0$ is the quotient of the distance from the first contour point to the centroid of the contour of the video object in the current frame image and the maximum of the distances from the contour points to the centroid, and $U_{N_p-1}$ is the quotient of the distance from the $N_p$-th contour point to the centroid of the contour of the video object in the current frame image and the maximum of the distances from the contour points to the centroid;
calculating a directional track vector $Q_k^i = (q_0, q_1, \ldots, q_{N_p-1})$ of the video object from the normalized track vector $U_k^i$, wherein $q_0, \ldots, q_{N_p-1}$ are the result of reordering $U_0, \ldots, U_{N_p-1}$;
performing length normalization on the directional track vector $Q_k^i$ to obtain a length-normalized directional track vector $L_k^i = (L_0, L_1, \ldots, L_{M-1})$ of the video object, wherein $L_0, \ldots, L_{M-1}$ are the result of performing length normalization on $Q_k^i$;
obtaining a wavelet contour descriptor $B_k = \{b_0, b_1, \ldots, b_{N-1}\}$ of the video object from the length-normalized directional track vector $L_k^i$, wherein $b_0, \ldots, b_{N-1}$ are the wavelet contour descriptor coefficients of $L_0, L_1, \ldots, L_{M-1}$;
wherein $N_p$ is the number of contour points constituting the contour vector $P_k^i$, $M$ is the length coefficient of the length-normalized directional track vector $L_k^i$, and $N$ denotes the coefficient length of the truncated wavelet transform result.
12. The method for tracking a video object according to claim 11, wherein the process of obtaining the contour vector $P_k^i = (P_0, \ldots, P_{N_p-1})$ of the video object from the contour points specifically comprises:
starting from the upper edge of the range defined by the video object index, taking the first contour point found by searching in the horizontal direction as the first contour point $P_0$;
with the first contour point $P_0$ as the center of a search template, searching within the range determined by the search template in the counterclockwise direction to obtain a second contour point $P_1$;
obtaining each subsequent contour point in the same way as the second contour point is obtained from the first contour point, until the $N_p$-th contour point $P_{N_p-1}$ is found;
obtaining the contour vector $P_k^i = (P_0, \ldots, P_{N_p-1})$ of the video object from (the first contour point $P_0$, ..., the $N_p$-th contour point $P_{N_p-1}$).
13. The method for tracking a video object according to claim 11, wherein the process of calculating the normalized track vector $U_k^i = (U_0, U_1, \ldots, U_{N_p-1})$ of the video object from the contour vector $P_k^i$ specifically comprises:
calculating the centroid coordinates $(TX_k^i, TY_k^i)$ of the video object contour from the contour vector $P_k^i$, the calculation formula of the centroid coordinates being:
$TX_k^i = \frac{1}{N_p} \sum_{n=0}^{N_p-1} x_n$, $TY_k^i = \frac{1}{N_p} \sum_{n=0}^{N_p-1} y_n$,
wherein $(x_n, y_n)$ are the coordinates of each contour point, $n = 0, 1, \ldots, N_p-1$;
calculating the normalized track vector $U_k^i = (U_0, U_1, \ldots, U_{N_p-1})$ according to the centroid coordinates $(TX_k^i, TY_k^i)$:
$r_n = \sqrt{(x_n - TX_k^i)^2 + (y_n - TY_k^i)^2}$,
$r_{\max} = \mathrm{Max}(r_0, r_1, \ldots, r_{N_p-1})$,
$U_n = r_n / r_{\max}$;
wherein $r_n$ is the distance from each contour point to the centroid, $r_{\max}$ is the maximum of the distances from the contour points to the centroid, and $n = 0, 1, \ldots, N_p-1$.
14. The method for tracking a video object according to claim 11, wherein the process of calculating the directional track vector $Q_k^i = (q_0, q_1, \ldots, q_{N_p-1})$ of the video object from the normalized track vector $U_k^i$ specifically comprises:
finding the maximum values and minimum values in the normalized track vector $U_k^i$ to form maximum-minimum value pairs;
finding, among the maximum-minimum value pairs, the pair whose maximum-value subscript and minimum-value subscript have the largest interval;
reordering according to the maximum-minimum value pair with the largest subscript interval, in the order from the minimum value to the maximum value, to obtain the directional track vector $Q_k^i = (q_0, q_1, \ldots, q_{N_p-1})$;
wherein the order "from the minimum value to the maximum value" means ordering with the minimum value as the starting term so as to ensure that the maximum value appears within the first $N_p/2$ entries.
15. The method for tracking a video object according to claim 11, wherein the process of performing length normalization on the directional track vector $Q_k^i$ to obtain the length-normalized directional track vector $L_k^i = (L_0, L_1, \ldots, L_{M-1})$ specifically comprises:
$a = \left[ \frac{i}{M} N_p \right]$;
$b = [a + 1]$;
$c = \frac{i}{M} N_p - a$;
$L_i = (1 - c) \times q_a + c \times q_b$, $(i = 0, 1, \ldots, M-1)$;
wherein $a$, $b$ and $c$ are constants.
16. The method for tracking a video object according to claim 11, wherein the process of obtaining the wavelet contour descriptor $B_k = \{b_0, b_1, \ldots, b_{N-1}\}$ of the video object from the length-normalized directional track vector $L_k^i$ specifically comprises:
performing a wavelet transform on the length-normalized directional track vector $L_k^i$ to obtain a transform result $(w_0, w_1, \ldots, w_{M-1})$, wherein $w_0, w_1, \ldots, w_{M-1}$ are the result of performing the wavelet transform on $L_0, L_1, \ldots, L_{M-1}$;
truncating the transform result $(w_0, w_1, \ldots, w_{M-1})$ according to the image resolution to obtain $B_k = \{b_0, b_1, \ldots, b_{N-1}\} = (w_0, w_1, \ldots, w_{N-1})$, the number of truncated coefficients being the same as the value of the resolution.
17. The method according to claim 2, wherein when the contour feature value is a wavelet contour descriptor, the process of comparing the contour feature value of the candidate contour with the contour feature value of the video object in the current frame image specifically comprises:
comparing the similarity between the wavelet contour descriptor of each candidate contour of the video object in the next frame image and the wavelet contour descriptor of the video object in the current frame image;
and if the similarity value of the wavelet contour descriptors of the two frames exceeds a similarity threshold value, taking the candidate contour as the contour, in the next frame image, of the video object tracked in the current frame image.
18. The method for tracking a video object according to claim 17, wherein the similarity value is calculated by:
$\mathrm{Similarity}(i) = 1 - \frac{1}{N} \sum_{i=0}^{N-1} \left( b_i^{k+1} - b_i^{k} \right)^2$,
wherein $b_i^{k}$ represents the wavelet contour descriptor of the video object in the current frame image, $b_i^{k+1}$ represents the wavelet contour descriptor of the video object in the next frame image, and $N$ represents the coefficient length of the truncated wavelet transform result.
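A direct numpy transcription of the similarity measure of claim 18 is shown below; a candidate contour is accepted when this value exceeds the similarity threshold of claim 17. The function name is an assumption for illustration.

    import numpy as np

    def descriptor_similarity(b_curr, b_next):
        """Similarity = 1 - (1/N) * sum_i (b_next[i] - b_curr[i])^2, as in claim 18."""
        b_curr = np.asarray(b_curr, dtype=np.float64)
        b_next = np.asarray(b_next, dtype=np.float64)
        n = len(b_curr)
        return 1.0 - np.sum((b_next - b_curr) ** 2) / n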
19. An apparatus for tracking video objects, the apparatus comprising:
the first positioning unit is used for acquiring the feature points of the video object contour in the current frame image;
the second positioning unit is used for finding matched feature points matched with the feature points in the next frame of image;
the contour detection unit is used for detecting at least one candidate contour of the video object in the next frame of image according to the matched feature points, and comprises: a region prediction module, used for obtaining, by taking the matched feature point as a center, the occurrence region of the video object contour in the next frame image through linear transformation; and a contour selection module, used for detecting at least one candidate contour of the video object within the occurrence region;
the first calculating unit is used for calculating the contour characteristic value of the video object in the current frame image;
a second calculation unit, configured to calculate a contour feature value of the candidate contour;
and the contour matching unit is used for comparing the contour characteristic value of the candidate contour with the contour characteristic value of the video object in the current frame image, and if the two contour characteristic values are matched, the candidate contour is the contour of the video object in the next frame image.
20. An apparatus for tracking video objects according to claim 19, wherein said first computing unit comprises:
the first contour detection module is used for carrying out contour detection on the video object in a current frame image to obtain contour points of the video object;
the first normalized wheel track vector calculation module is used for obtaining a normalized wheel track vector of the video object from the contour points;
the first directional contour vector calculation module is used for obtaining a directional contour vector of the video object from the normalized wheel track vector calculated by the normalized wheel track vector calculation module;
the first length normalization directional contour vector calculation module is used for carrying out length normalization on the directional contour vector to obtain a length normalization directional contour vector;
and the first contour characteristic value calculation module is used for obtaining the wavelet contour descriptor of the video object from the length normalization orientation contour vector.
21. An apparatus for tracking video objects according to claim 19, wherein said second computing unit comprises:
the second normalized wheel track vector calculation module is used for obtaining a normalized wheel track vector of the video object from the contour points;
the second directional contour vector calculation module is used for obtaining a directional contour vector of the video object from the normalized wheel track vector calculated by the normalized wheel track vector calculation module;
the second length normalization directional contour vector calculation module is used for carrying out length normalization on the directional contour vector to obtain a length normalization directional contour vector;
and the second contour characteristic value calculation module is used for obtaining the wavelet contour descriptor of the video object from the length normalization orientation contour vector.
CN2008100005828A 2008-01-23 2008-01-23 Method and apparatus for tracking video object Active CN101493889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100005828A CN101493889B (en) 2008-01-23 2008-01-23 Method and apparatus for tracking video object

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008100005828A CN101493889B (en) 2008-01-23 2008-01-23 Method and apparatus for tracking video object

Publications (2)

Publication Number Publication Date
CN101493889A CN101493889A (en) 2009-07-29
CN101493889B true CN101493889B (en) 2011-12-07

Family

ID=40924481

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100005828A Active CN101493889B (en) 2008-01-23 2008-01-23 Method and apparatus for tracking video object

Country Status (1)

Country Link
CN (1) CN101493889B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110111106A (en) * 2010-04-02 2011-10-10 삼성테크윈 주식회사 Method and apparatus for object tracking and loitering
CN102469350A (en) * 2010-11-16 2012-05-23 北大方正集团有限公司 Method, device and system for advertisement statistics
CN102129684B (en) * 2011-03-17 2012-12-12 南京航空航天大学 Method for matching images of different sources based on fit contour
US9582896B2 (en) * 2011-09-02 2017-02-28 Qualcomm Incorporated Line tracking with automatic model initialization by graph matching and cycle detection
CN104182751B (en) * 2014-07-25 2017-12-12 小米科技有限责任公司 Object edge extracting method and device
US9860553B2 (en) * 2015-03-18 2018-01-02 Intel Corporation Local change detection in video
CN105261040B (en) * 2015-10-19 2018-01-05 北京邮电大学 A kind of multi-object tracking method and device
WO2017166098A1 (en) * 2016-03-30 2017-10-05 Xiaogang Wang A method and a system for detecting an object in a video
CN106326928B (en) * 2016-08-24 2020-01-07 四川九洲电器集团有限责任公司 Target identification method and device
CN106548487B (en) * 2016-11-25 2019-09-03 浙江光跃环保科技股份有限公司 Method and apparatus for detection and tracking mobile object
CN106583955B (en) * 2016-12-13 2019-05-03 鸿利智汇集团股份有限公司 A kind of wire soldering method of detection chip fixed-direction
CN109035205A (en) * 2018-06-27 2018-12-18 清华大学苏州汽车研究院(吴江) Water hyacinth contamination detection method based on video analysis
CN109977833B (en) * 2019-03-19 2021-08-13 网易(杭州)网络有限公司 Object tracking method, object tracking device, storage medium, and electronic apparatus
CN112102342B (en) * 2020-09-01 2023-12-01 腾讯科技(深圳)有限公司 Plane contour recognition method, plane contour recognition device, computer equipment and storage medium
CN112297028A (en) * 2020-11-05 2021-02-02 中国人民解放军海军工程大学 Overwater U-shaped intelligent lifesaving robot control system and method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1187092A (en) * 1996-12-31 1998-07-08 大宇电子株式会社 Contour tracing method
CN101009021A (en) * 2007-01-25 2007-08-01 复旦大学 Video stabilizing method based on matching and tracking of characteristic

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1187092A (en) * 1996-12-31 1998-07-08 大宇电子株式会社 Contour tracing method
CN101009021A (en) * 2007-01-25 2007-08-01 复旦大学 Video stabilizing method based on matching and tracking of characteristic

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Niu Dejiao, Zhan Yongzhao, Song Shunlin. Face detection and tracking in real-time video images. Computer Applications (《计算机应用》), 2004, Vol. 24, No. 6, pp. 105-107. *

Also Published As

Publication number Publication date
CN101493889A (en) 2009-07-29

Similar Documents

Publication Publication Date Title
CN101493889B (en) Method and apparatus for tracking video object
CN108154118B (en) A kind of target detection system and method based on adaptive combined filter and multistage detection
CN109903313B (en) Real-time pose tracking method based on target three-dimensional model
CN107680120B (en) Infrared small target tracking method based on sparse representation and transfer limited particle filtering
CN104200495B (en) A kind of multi-object tracking method in video monitoring
JP4625074B2 (en) Sign-based human-machine interaction
CN111028292B (en) Sub-pixel level image matching navigation positioning method
JP5063776B2 (en) Generalized statistical template matching based on geometric transformation
CN115240130A (en) Pedestrian multi-target tracking method and device and computer readable storage medium
WO2018049704A1 (en) Vehicle detection, tracking and localization based on enhanced anti-perspective transformation
US11475595B2 (en) Extrinsic calibration of multi-camera system
CN111914756A (en) Video data processing method and device
CN108537822B (en) Moving target tracking method based on weighted confidence estimation
Kottath et al. Mutual information based feature selection for stereo visual odometry
CN111914627A (en) Vehicle identification and tracking method and device
Khan et al. Bayesian online learning on Riemannian manifolds using a dual model with applications to video object tracking
CN112991394A (en) KCF target tracking method based on cubic spline interpolation and Markov chain
Hongpeng et al. A robust object tracking algorithm based on surf and Kalman filter
Dang et al. Fast object hypotheses generation using 3D position and 3D motion
Rhee et al. Vehicle tracking using image processing techniques
Guo et al. A hybrid framework based on warped hierarchical tree for pose estimation of texture-less objects
CN107909608A (en) The moving target localization method and device suppressed based on mutual information and local spectrum
Yin et al. Real-time head pose estimation for driver assistance system using low-cost on-board computer
Li et al. Adaptive object tracking algorithm based on eigenbasis space and compressive sampling
Asif et al. AGV guidance system: An application of simple active contour for visual tracking

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant