CN110570451A - multithreading visual target tracking method based on STC and block re-detection - Google Patents

Multithreading visual target tracking method based on STC and block re-detection

Info

Publication number
CN110570451A
CN110570451A (application CN201910716977.6A)
Authority
CN
China
Prior art keywords
target
image
frame image
stc
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910716977.6A
Other languages
Chinese (zh)
Other versions
CN110570451B (en)
Inventor
汪鼎文
陈曦
王泉德
孙世磊
瞿涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201910716977.6A
Publication of CN110570451A
Application granted
Publication of CN110570451B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/223 - Analysis of motion using block-matching
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/50 - Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10016 - Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a multithreading visual target tracking method based on STC and block re-detection, which comprises the following steps: S1, reading the first frame image and determining the tracking target; S2, establishing a spatial context model for the first frame image using the STC algorithm; S3, performing a blocking operation on the rectangular template region containing the tracking target in the first frame image and training an SVM classifier; S4, reading the next frame image, learning a spatial context model from the previous frame image, and computing the target neighborhood context prior; S5, updating the spatio-temporal context model of the current frame image; S6, obtaining the confidence map of the current frame image; S7, judging the degree of occlusion of the tracking target in the current frame image according to the confidence probability; S8, selecting the corresponding processing strategy according to the judged occlusion condition of the tracked target; and S9, repeating steps S4-S8 until the current video or image sequence has been fully processed. The invention improves both the reliability and the efficiency of target tracking.

Description

Multithreading visual target tracking method based on STC and block re-detection
Technical Field
The invention relates to the field of target detection and tracking in computer vision, and in particular to a multithreading visual target tracking method based on STC and block re-detection.
Background
Visual target tracking is an important research direction in computer vision, with extremely wide applications in fields such as military unmanned aircraft, precision guidance, air early warning, civilian video surveillance, human-computer interaction, and autonomous driving. However, target tracking faces challenges such as target scale change, severe occlusion, fast motion, out-of-view targets, and illumination change, so a reliable, real-time visual target tracking method is of great practical significance.
Traditional visual target tracking methods make effective use of temporal context information, tracking the position and scale of the target in the current frame from the position and scale of the target in the previous frame image. On this basis, STC establishes a spatio-temporal relation between the tracked target and its local context within a Bayesian framework, converting the tracking problem into the computation of a confidence map and obtaining the target position by maximizing a target location likelihood function. STC handles slight occlusion, posture change, and illumination change of the target well and meets real-time processing requirements, but it easily fails under heavy occlusion, fast motion, and out-of-view conditions during long-term tracking. Considering that the time interval between two adjacent frames is short, the local area near the target changes little even when the tracked target is severely occluded. LCT therefore uses a kernel ridge regression method based on correlation filters to encode an appearance template consisting of the target object and its surroundings; the adaptive template built from the extracted features withstands severe occlusion, fast motion, and severe deformation, solving the drift that traditional correlation-filter-based trackers develop during long-term tracking. In addition, LCT trains another correlation filter to estimate target scale change, constructing a multi-scale target pyramid from HOG features to search for the optimal target scale in detail. For tracking failures that require re-detection, caused by long-term severe occlusion or the target leaving the field of view, LCT trains an online detector using a random fern classifier and scans windows when the detector is activated. However, target tracking with the LCT algorithm alone is extremely slow.
Disclosure of Invention
The invention aims to provide a multithreading visual target tracking method based on STC and block re-detection, so as to solve the problems of existing target tracking methods, which easily fail and run slowly under heavy occlusion, fast motion, and out-of-view conditions during long-term tracking.
The invention provides a multithreading visual target tracking method based on STC and block re-detection, which comprises the following steps:
S1, reading the first frame image of the video or image sequence and determining the tracking target;
S2, establishing a spatial context model for the first frame image using the STC algorithm;
S3, performing a blocking operation on the rectangular template region containing the tracking target in the first frame image, and training an SVM classifier with blocks of the manually selected target rectangle in the first frame image;
S4, reading the next frame image and learning a spatial context model from the previous frame image;
S5, updating the spatio-temporal context model of the current frame image using the spatial context model learned from the previous frame image in step S4;
S6, obtaining the confidence map of the current frame image, and obtaining the target position and confidence probability of the current frame image by maximizing the confidence map;
S7, judging the degree of occlusion of the tracking target in the current frame image according to the confidence probability obtained in step S6;
S8, selecting the corresponding algorithm to update the target position according to the occlusion degree judged in step S7;
and S9, repeating steps S4-S8 until the current video or image sequence has been fully processed.
Further, the blocking performed on the first frame image in step S3 includes vertical blocking, in which each sub-region is a rectangle with half the height and one-tenth the width of the template region, and horizontal blocking, in which each sub-region is a rectangle with half the width and one-tenth the height of the template region, giving forty sub-regions in total.
Further, step S4 specifically comprises:
Let the current frame image be the (t+1)-th frame image and the target position in the previous, i.e. the t-th, frame image be $x^*$. The spatial context model $h^{sc}_t(x)$ is learned from the t-th frame image as follows:
The target neighborhood context prior probability is computed by

$$p(c(z) \mid o) = I(z)\, w_\sigma(z - x^*)$$

where $p(c(z) \mid o)$ is the context prior of the target neighborhood and describes the appearance of the local image region, $z$ is a position in the target neighborhood, $c(z)$ is an element of the context feature set $X^c = \{ c(z) = (I(z), z) \mid z \in \Omega_c(x^*) \}$, $o$ denotes the tracking target appearing in the current scene, $I(z)$ is the image gray value at position $z$, and $w_\sigma(\cdot)$ is the weight function defined by

$$w_\sigma(z) = a\, e^{-\frac{|z|^2}{\sigma^2}}$$

where $a$ is a normalization constant keeping the value of $p(c(z) \mid o)$ within the interval $[0, 1]$, and $\sigma$ is a scale parameter.
The target location likelihood probability confidence map is

$$c(x) = p(x \mid o) = b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} = \sum_{c(z) \in X^c} p(x \mid c(z), o)\, p(c(z) \mid o) = h^{sc}(x) \otimes \left( I(x)\, w_\sigma(x - x^*) \right)$$

where $\otimes$ denotes convolution, $x \in \mathbb{R}^2$ is the target position, $b$ is a normalization constant, $\alpha$ is a scale parameter, $\beta$ is a shape parameter, and $\Omega_c(x^*)$ denotes the local context region of $x^*$.
To perform fast convolution calculations using the FFT, the above equation is transformed to the frequency domain:

$$\mathcal{F}\left( b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} \right) = \mathcal{F}\left( h^{sc}(x) \right) \odot \mathcal{F}\left( I(x)\, w_\sigma(x - x^*) \right)$$

where $\mathcal{F}$ denotes the FFT and $\odot$ element-wise multiplication. The spatial context model $h^{sc}(x)$ is therefore computed as

$$h^{sc}(x) = \mathcal{F}^{-1}\left( \frac{ \mathcal{F}\left( b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} \right) }{ \mathcal{F}\left( I(x)\, w_\sigma(x - x^*) \right) } \right)$$

where $\mathcal{F}^{-1}$ is the inverse FFT.
Further, step S5 specifically comprises:
Using the spatial context model $h^{sc}_t$ learned from the t-th frame image in step S4, the spatio-temporal context model for the (t+1)-th frame image is updated as

$$H^{stc}_{t+1} = (1 - \rho)\, H^{stc}_t + \rho\, h^{sc}_t$$

where $\rho$ is a learning parameter, $h^{sc}_t$ is the spatial context model learned from the t-th frame image, and $H^{stc}_t$ is the spatio-temporal context model accumulated over the first t frames of the video or image sequence.
Further, the confidence map of the current frame image in step S6 is obtained as

$$c_{t+1}(x) = \mathcal{F}^{-1}\left( \mathcal{F}\left( H^{stc}_{t+1}(x) \right) \odot \mathcal{F}\left( I_{t+1}(x)\, w_{\sigma_t}(x - x^*_t) \right) \right)$$

where $x^*_t$ denotes the target location in the t-th frame.
Further, in step S7, when the confidence probability value is greater than a1, the tracked target is judged not occluded; when the confidence probability value is between a2 and a1, the tracked target is judged generally occluded; when the confidence probability value is less than a2, the tracked target is judged severely occluded or out of view; where 1 > a1 > a2 > 0, with a1 = 0.75 and a2 = 0.3.
Further, in step S8:
(1) If the tracked target is judged not occluded, the current frame image continues target tracking with STC, and the target position is obtained by maximizing the confidence map $c_{t+1}(x)$ obtained in step S6:

$$x^*_{t+1} = \arg\max_x c_{t+1}(x)$$

that is, the target position of the current frame image obtained by maximizing the confidence map in step S6 is the final target position.
The target scale is also updated: since the target size may change over the video image sequence, the scale parameter $\sigma$ of the weight function $w_\sigma$ should also be updated. The update strategy for $\sigma$ is

$$s'_t = \sqrt{ \frac{ c_t(x^*_t) }{ c_{t-1}(x^*_{t-1}) } }, \qquad \bar{s}_t = \frac{1}{n} \sum_{i=1}^{n} s'_{t-i}, \qquad s_{t+1} = (1 - \lambda)\, s_t + \lambda\, \bar{s}_t, \qquad \sigma_{t+1} = s_{t+1}\, \sigma_t$$

where $c_t(\cdot)$ is the confidence distribution, $s'_t$ is the target scale estimated from two consecutive frame images, $\bar{s}_t$ is the average of the scale estimates over the previous n consecutive frames, and $\lambda > 0$ is a given filter parameter.
Meanwhile, the rectangular template region containing the tracking target in the current frame image is blocked, the gray histogram of each rectangular sub-region is extracted using the integral histogram, and the image HOG features of the target region are used to train the SVM classifier.
(2) If the tracked target is judged generally occluded, the target position is still updated with STC; in addition, when the confidence probability value is less than 0.7, the rectangular template region containing the tracking target in the current frame image is blocked, the gray histogram of each rectangular sub-region is extracted using the integral histogram, and the image HOG features of the target region are used to train the SVM classifier.
(3) If the tracked target is judged severely occluded or out of view, the occlusion of the tracked target in the current frame image is judged further using the blocks, specifically:
A search range of radius r is defined, centered on the target position obtained from the previous frame image. For each position (x, y) in the search range, a rectangular target region centered on (x, y) corresponds to the target region of the previous frame; the blocking operation is applied to the rectangular target region at position (x, y), and the gray histogram of each block is extracted using the integral histogram.
The similarity between blocks is then computed with the EMD, yielding for each block a similarity map of roughly the size of the search area. Minimizing this map gives the EMD value of the sub-block most similar to the current sub-region block within the search area. The values for all blocks are sorted in ascending order; when the fifth value is smaller than a set threshold, the target occlusion is light and the target position is still updated with STC. Otherwise, the target is severely occluded or out of view, the SVM classifier is used for scoring, and the target position is relocated by maximizing the score map.
Further, the EMD is defined as follows:
Let

$$P = \{ (p_1, w_{p_1}), \ldots, (p_M, w_{p_M}) \}, \qquad Q = \{ (q_1, w_{q_1}), \ldots, (q_N, w_{q_N}) \}$$

where $p_i$ is a feature of one image, $q_j$ is a feature of the other image, $w_{p_i}$ is the weight of feature $p_i$, $w_{q_j}$ is the weight of feature $q_j$, and $d_{ij}$ is the distance between $p_i$ and $q_j$.
The flow $f_{ij}$ is obtained by solving the optimization problem

$$\min_f \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}\, d_{ij}$$

subject to

$$f_{ij} \ge 0, \quad i = 1, 2, \ldots, M; \; j = 1, 2, \ldots, N;$$
$$\sum_{j=1}^{N} f_{ij} \le w_{p_i}, \qquad \sum_{i=1}^{M} f_{ij} \le w_{q_j}, \qquad \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij} = \min\left( \sum_{i=1}^{M} w_{p_i}, \; \sum_{j=1}^{N} w_{q_j} \right)$$

where M and N denote the numbers of (feature, weight) pairs in the sets P and Q respectively. After solving for $f_{ij}$, the EMD is defined as

$$\mathrm{EMD}(P, Q) = \frac{ \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}\, d_{ij} }{ \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij} }$$

When

$$\sum_{i=1}^{M} w_{p_i} = \sum_{j=1}^{N} w_{q_j}$$

that is, when the two histograms have equal total weight, the EMD satisfies the triangle inequality, i.e. the EMD is a true distance.
The invention also provides a computer-readable storage medium, on which a computer program is stored which, when executed, implements the method of any one of claims 1 to 8.
The invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to any one of claims 1 to 8 when executing the program.
Compared with the prior art, the invention has the following beneficial effects:
In the multithreading visual target tracking method based on STC and block re-detection, the main algorithm body uses STC target tracking, which handles the general tracking case; at the same time, drawing on the re-detection strategy of LCT, re-detection is performed when tracking fails because the target is heavily occluded, moves rapidly, or leaves the field of view. To keep the algorithm simple and effective, the rectangular target template image is blocked into several predefined rectangular sub-regions, the gray histograms within these sub-regions are computed using the integral histogram and compared with the gray histograms of the target template blocks of the first frame image, and an SVM classifier is trained, replacing the KNN classifier and the random fern classifier used in the LCT algorithm for target re-detection. The invention improves both the reliability and the efficiency of target tracking.
Drawings
Fig. 1 is an overall flowchart of a multithreading visual target tracking method based on STC and block re-detection according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the STC spatio-temporal context model and tracking of the (t+1)-th frame image according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the horizontal blocks and vertical blocks of an image according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in Fig. 1 and Fig. 2, an embodiment of the present invention provides a multithreading visual target tracking method based on STC and block re-detection, including the following steps:
S1, open the first frame image of a video or image sequence and determine the tracking target, either manually or through a target detection algorithm; at the same time, start an LCT thread for training an SVM classifier, where the trained SVM classifier is used for target re-detection.
S2, establish a spatial context model for the first frame image using the STC algorithm, and initialize the parameters related to blocking and classifier training in the LCT.
S3, perform a blocking operation on the rectangular template region containing the tracking target in the first frame image, as shown in Fig. 3. The blocking includes vertical blocking and horizontal blocking: each sub-region in the vertical blocking is a rectangle with half the height and one-tenth the width of the template region, and each sub-region in the horizontal blocking is a rectangle with half the width and one-tenth the height of the template region, giving forty sub-regions in total. The number of sub-regions can be adjusted according to the performance requirements of target tracking.
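By way of illustration, the following minimal Python sketch performs this blocking scheme; the function name, the block ordering, and the use of NumPy are illustrative choices, not part of the claimed method:

```python
import numpy as np

def partition_template(template: np.ndarray) -> list:
    """Split a grayscale template region into the forty predefined sub-regions:
    twenty vertical blocks (height/2 x width/10) followed by twenty horizontal
    blocks (height/10 x width/2). Integer division trims odd remainders."""
    h, w = template.shape
    blocks = []
    # Vertical blocking: 2 rows x 10 columns of (h/2, w/10) blocks.
    bh, bw = h // 2, w // 10
    for r in range(2):
        for c in range(10):
            blocks.append(template[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw])
    # Horizontal blocking: 10 rows x 2 columns of (h/10, w/2) blocks.
    bh, bw = h // 10, w // 2
    for r in range(10):
        for c in range(2):
            blocks.append(template[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw])
    return blocks  # forty sub-regions in total
```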
S4, read the next frame image, learn the spatial context model from the previous frame image, and compute the target neighborhood context prior. The specific processing is as follows:
Let the current frame image be the (t+1)-th frame image and the target position in the t-th frame image be $x^*$. The spatial context model $h^{sc}_t(x)$ is learned from the t-th frame image as follows:
The target neighborhood context prior probability $p(c(z) \mid o)$, which describes the appearance of the local image region, is computed as

$$p(c(z) \mid o) = I(z)\, w_\sigma(z - x^*)$$

where $z$ is an arbitrary position in the target neighborhood, $c(z)$ is an element of the context feature set $X^c = \{ c(z) = (I(z), z) \mid z \in \Omega_c(x^*) \}$ ($\Omega_c(x^*)$ denotes the local context region of $x^*$), $o$ denotes the presence of the tracking target in the image, $I(z)$ is the image gray value at position $z$, and $w_\sigma(\cdot)$ is the weight function defined by

$$w_\sigma(z) = a\, e^{-\frac{|z|^2}{\sigma^2}}$$

where $a$ is a normalization constant keeping the value of $p(c(z) \mid o)$ within the interval $[0, 1]$, and $\sigma$ is a scale parameter.
The target location likelihood probability confidence map is

$$c(x) = p(x \mid o) = b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} = h^{sc}(x) \otimes \left( I(x)\, w_\sigma(x - x^*) \right)$$

where $\otimes$ denotes convolution, $x$ is the position of the target in the current frame, $b$ is a normalization constant, $\alpha$ is a scale parameter, and $\beta$ is a shape parameter. To perform fast convolution calculations using the FFT, the above equation is transformed to the frequency domain:

$$\mathcal{F}\left( b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} \right) = \mathcal{F}\left( h^{sc}(x) \right) \odot \mathcal{F}\left( I(x)\, w_\sigma(x - x^*) \right)$$

where $\mathcal{F}$ denotes the FFT and $\odot$ element-wise multiplication. The spatial context model $h^{sc}(x)$ then follows as

$$h^{sc}(x) = \mathcal{F}^{-1}\left( \frac{ \mathcal{F}\left( b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} \right) }{ \mathcal{F}\left( I(x)\, w_\sigma(x - x^*) \right) } \right)$$

where $\mathcal{F}^{-1}$ is the inverse FFT.
S5, update the spatio-temporal context model of the current frame image from the spatial context model learned from the previous frame image in step S4:

$$H^{stc}_{t+1} = (1 - \rho)\, H^{stc}_t + \rho\, h^{sc}_t$$

where $\rho$ is a learning parameter, $h^{sc}_t$ is the spatial context model learned from the t-th frame image, and $H^{stc}_t$ is the spatio-temporal context model accumulated over the first t frames of the video or image sequence.
The above equation can be seen as a temporal filtering process, which is easily observed in the frequency domain:

$$H^{stc}_\omega = F_\omega\, h^{sc}_\omega$$

where

$$H^{stc}_\omega \triangleq \sum_t H^{stc}_t\, e^{-j \omega t}$$

is the temporal Fourier transform of $H^{stc}_t$, $j$ is the imaginary unit, $\omega$ is the frequency, and $h^{sc}_\omega$ is defined similarly. The temporal filter is

$$F_\omega = \frac{\rho}{e^{j \omega} - (1 - \rho)}$$

which is a low-pass filter. Therefore, the spatio-temporal context model effectively filters out image noise caused by target state changes in the video image sequence and enhances the stability of the target tracking algorithm.
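The update itself reduces to one exponential moving average per pixel; a sketch, where the default rho = 0.075 is the learning rate reported for the original STC tracker and is only an assumed value here:

```python
def update_stc_model(H_stc, h_sc, rho=0.075):
    """Low-pass temporal filtering of the model:
    H^stc_{t+1} = (1 - rho) * H^stc_t + rho * h^sc_t."""
    return (1.0 - rho) * H_stc + rho * h_sc
```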
S6, compute the confidence map of the current frame as

$$c_{t+1}(x) = \mathcal{F}^{-1}\left( \mathcal{F}\left( H^{stc}_{t+1}(x) \right) \odot \mathcal{F}\left( I_{t+1}(x)\, w_{\sigma_t}(x - x^*_t) \right) \right)$$

where $x^*_t$ denotes the target position in the t-th frame; the target position in the current frame and its confidence probability are obtained by maximizing the confidence map.
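A sketch of this tracking step, consistent with the learning sketch above (all names are illustrative; the peak-to-coordinate mapping assumes the confidence map is indexed relative to the top-left corner of the context window):

```python
import numpy as np

def stc_track(H_stc, frame, prev_pos, ctx_size, sigma):
    """One STC tracking step: build the context prior around the previous
    target position, multiply with the model in the frequency domain, and
    take the confidence peak as the new target position."""
    cy, cx = prev_pos
    h, w = ctx_size
    ctx = frame[cy - h // 2:cy + h // 2,
                cx - w // 2:cx + w // 2].astype(np.float64)
    ys, xs = np.mgrid[-(h // 2):h // 2, -(w // 2):w // 2]
    prior = ctx * np.exp(-(xs ** 2 + ys ** 2) / sigma ** 2)
    # c_{t+1} = F^-1( F(H^stc) element-wise-times F(I * w_sigma) )
    conf = np.real(np.fft.ifft2(np.fft.fft2(H_stc) * np.fft.fft2(prior)))
    dy, dx = np.unravel_index(conf.argmax(), conf.shape)
    new_pos = (cy - h // 2 + dy, cx - w // 2 + dx)
    return new_pos, float(conf.max())
```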
S7, judge the degree of occlusion of the tracking target in the current frame image according to the confidence probability obtained in step S6: when the confidence probability value is greater than a1, the tracked target is judged not occluded, including the case of light occlusion; when the confidence probability value is between a2 and a1, the tracked target is judged generally occluded; when the confidence probability value is less than a2, the tracked target is judged severely occluded or out of view, where 1 > a1 > a2 > 0. The thresholds a1 and a2 are set according to the actual situation; in practice, a1 = 0.75 and a2 = 0.3.
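This three-way decision is a pair of threshold comparisons; a sketch with the thresholds of this embodiment:

```python
def occlusion_level(confidence: float, a1: float = 0.75, a2: float = 0.3) -> str:
    """Map the STC confidence probability to the three cases of step S7."""
    if confidence > a1:
        return "none"     # no (or only light) occlusion: keep tracking with STC
    if confidence >= a2:
        return "general"  # general occlusion: keep STC, prepare re-detection
    return "severe"       # severe occlusion or out of view: block re-detection
```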
S8, select the corresponding processing strategy according to the occlusion condition of the tracking target judged in step S7:
If the tracked target is judged not occluded, i.e. the confidence probability value is greater than 0.75, the current frame image continues target tracking with STC, and the target position in the (t+1)-th frame image is obtained by maximizing the confidence map $c_{t+1}(x)$ from step S6:

$$x^*_{t+1} = \arg\max_x c_{t+1}(x)$$

that is, the target position of the current frame image obtained by maximizing the confidence map in step S6 is the final target position.
The target scale is also updated: since the target size may change over the video image sequence, the scale parameter $\sigma$ of the weight function $w_\sigma$ should also be updated, using the strategy

$$s'_t = \sqrt{ \frac{ c_t(x^*_t) }{ c_{t-1}(x^*_{t-1}) } }, \qquad \bar{s}_t = \frac{1}{n} \sum_{i=1}^{n} s'_{t-i}, \qquad s_{t+1} = (1 - \lambda)\, s_t + \lambda\, \bar{s}_t, \qquad \sigma_{t+1} = s_{t+1}\, \sigma_t$$

where $c_t(\cdot)$ is the confidence distribution and $s'_t$ is the target scale estimated from two consecutive frame images. To avoid over-sensitivity of the scale and to reduce noise due to estimation error, $s_{t+1}$ is estimated by filtering: $\bar{s}_t$ is the average of the scale estimates over the previous n consecutive frames, and $\lambda > 0$ is a given filter parameter.
Meanwhile, the LCT update flag is set to 1 and the LCT thread continues to be updated. When the confidence probability value is greater than 0.75, the tracked target is not occluded or only lightly occluded, so the current frame is highly similar to the previous frame. After STC processes the current image frame t+1, the rectangular region containing the tracked target in the current frame image is blocked (the operation is similar to the blocking in step S3), and the gray histogram of each block, i.e. the number of occurrences of each gray value within the block, is extracted using the integral histogram. By computing the integral histogram at each position of an image once, the histogram of any rectangular region in the image can be solved quickly. In parallel with this processing, the image HOG features of the region containing the target are used to train the SVM classifier for target relocation when needed.
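The integral histogram that makes this fast can be sketched as follows: one cumulative table per gray bin, after which the histogram of any rectangle costs four lookups per bin (the bin count and function names are illustrative):

```python
import numpy as np

def integral_histogram(gray: np.ndarray, bins: int = 16) -> np.ndarray:
    """ih[y, x, b] = number of pixels of bin b inside gray[:y, :x]."""
    h, w = gray.shape
    bin_idx = (gray.astype(np.int32) * bins) // 256        # 0..bins-1
    onehot = np.zeros((h, w, bins), dtype=np.int32)
    np.put_along_axis(onehot, bin_idx[..., None], 1, axis=2)
    ih = np.zeros((h + 1, w + 1, bins), dtype=np.int32)
    ih[1:, 1:] = onehot.cumsum(axis=0).cumsum(axis=1)
    return ih

def block_histogram(ih, y0, x0, y1, x1):
    """Gray histogram of gray[y0:y1, x0:x1] in four lookups per bin."""
    return ih[y1, x1] - ih[y0, x1] - ih[y1, x0] + ih[y0, x0]
```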
If the tracked target is judged generally occluded, i.e. the confidence probability value is between 0.3 and 0.75, the target position is still updated with STC. Because the STC confidence probability drops from high to low over adjacent frames as the target goes from unoccluded to generally or fully occluded, if the LCT update flag is still True and the confidence probability value is greater than 0.7, the target image is beginning to exceed the STC processing range due to occlusion or deformation: the LCT is no longer updated, the LCT update flag is set to False, and the LCT estimation flag is set to True. When the confidence probability value is less than 0.7, the LCT continues to be updated, and the rectangular template region containing the tracking target in the current frame image is blocked (similarly to the blocking in step S3); the gray histogram of each rectangular sub-region is extracted using the integral histogram, and the image HOG features of the target region are used to train the SVM classifier for target relocation when needed.
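The patent does not prescribe an implementation for the HOG-plus-SVM training; the following sketch assumes that scikit-image HOG features and a linear SVM from scikit-learn are acceptable stand-ins, and that all patches share one size so the feature vectors align:

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def train_redetection_svm(pos_patches, neg_patches):
    """Train the re-detection classifier on HOG features of target (positive)
    and background (negative) grayscale patches of identical size."""
    X = [hog(p, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
         for p in list(pos_patches) + list(neg_patches)]
    y = [1] * len(pos_patches) + [0] * len(neg_patches)
    return LinearSVC(C=1.0).fit(np.asarray(X), np.asarray(y))
```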
If the tracked target is judged severely occluded or out of view, i.e. the confidence probability value is less than 0.3, STC target tracking has failed. The occlusion of the tracked target in the current frame image is then judged further using the blocks, as follows:
Because the target's motion between adjacent frames is limited, a search range of radius r is defined, centered on the target position obtained from the previous frame image. For each position (x, y) in the search range, a rectangular target region centered on (x, y) corresponds to the target region of the previous frame; the blocking operation of step S3 is applied to the rectangular target region at position (x, y), and the gray histogram of each block is computed from the integral histogram of the whole image.
Then the EMD is used to compute the similarity between blocks. Two block feature sets P and Q are defined as

$$P = \{ (p_1, w_{p_1}), \ldots, (p_M, w_{p_M}) \}, \qquad Q = \{ (q_1, w_{q_1}), \ldots, (q_N, w_{q_N}) \}$$

where M and N are the numbers of features in P and Q respectively, $p_i \in \mathbb{R}$ and $q_j \in \mathbb{R}$ ($\mathbb{R}$ being the set of real numbers), $p_i$ denotes the i-th ($1 \le i \le M$) feature in P, $q_j$ denotes the j-th ($1 \le j \le N$) feature in Q, $w_{p_i}$ is the weight of feature $p_i$, $w_{q_j}$ is the weight of feature $q_j$, and $d_{ij}$ is the Euclidean distance between $p_i$ and $q_j$.
The similarity of two image regions is described by the EMD, defined as

$$\mathrm{EMD}(P, Q) = \frac{ \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}\, d_{ij} }{ \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij} }$$

where the flow $f_{ij}$ solves the optimization problem

$$\min_f \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}\, d_{ij}$$

subject to

$$f_{ij} \ge 0, \quad i = 1, 2, \ldots, M; \; j = 1, 2, \ldots, N;$$
$$\sum_{j=1}^{N} f_{ij} \le w_{p_i}, \qquad \sum_{i=1}^{M} f_{ij} \le w_{q_j}, \qquad \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij} = \min\left( \sum_{i=1}^{M} w_{p_i}, \; \sum_{j=1}^{N} w_{q_j} \right)$$

When $\sum_{i=1}^{M} w_{p_i} = \sum_{j=1}^{N} w_{q_j}$, the EMD satisfies the triangle inequality and can be used to judge the similarity between two image block histograms.
For each block, the EMD between that block and the block at each position in the search area is computed, forming a similarity map of the same size as the search area; minimizing the similarity map gives the EMD value of the sub-block most similar to that block within the search area (the smaller the EMD value, the more similar). The values for all blocks are sorted in ascending order. As verified by extensive experiments, when the fifth value is smaller than the set threshold, the target occlusion is relatively light, and the target position is still updated with STC. Otherwise, the target is severely occluded or out of view; the SVM classifier then estimates, for every image region of the same size as the target image region in the whole image, the probability that it is the target region, and the image region with the maximum probability is taken as the new target position, thereby achieving target relocation.
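For the EMD comparison itself, OpenCV's cv2.EMD accepts signatures whose rows are (weight, coordinate) pairs, which fits one-dimensional gray histograms directly; the decision threshold below is an assumed placeholder, since the patent leaves its value to experiment:

```python
import cv2
import numpy as np

def hist_emd(h1: np.ndarray, h2: np.ndarray) -> float:
    """EMD between two 1-D gray histograms; each signature row is
    (weight, bin centre)."""
    sig1 = np.column_stack([h1, np.arange(len(h1))]).astype(np.float32)
    sig2 = np.column_stack([h2, np.arange(len(h2))]).astype(np.float32)
    emd, _, _ = cv2.EMD(sig1, sig2, cv2.DIST_L2)
    return emd

def occlusion_is_light(best_emds, threshold=0.15):
    """best_emds holds, per template block, the minimum EMD over the search
    area. If the fifth-smallest value stays below the threshold, enough
    blocks still match and STC keeps updating the position."""
    return sorted(best_emds)[4] < threshold
```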
S9, repeat steps S4-S8 until the current video or image sequence has been fully processed. Specifically, the loop condition checks whether the current video or image sequence has been processed; if so, target tracking ends, otherwise the next frame image is extracted and computation restarts from step S4.
According to the multithreading visual target tracking method based on STC and block re-detection provided by the embodiment of the invention, the main algorithm body uses STC target tracking, which handles the general tracking case; at the same time, drawing on the re-detection strategy of LCT, re-detection is performed when tracking fails because the target is heavily occluded, moves rapidly, or leaves the field of view. To keep the algorithm simple and effective, the rectangular target template image is blocked into several predefined rectangular sub-regions, the gray histograms within these sub-regions are computed using the integral histogram and compared with the gray histograms of the target template blocks of the first frame image, and an SVM classifier is trained, replacing the KNN classifier and the random fern classifier used in the LCT algorithm for target re-detection. The invention improves both the reliability and the efficiency of target tracking.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, and that the program may be stored on a computer-readable storage medium, which may include a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A multithreading visual target tracking method based on STC and block re-detection, characterized by comprising the following steps:
S1, reading the first frame image of the video or image sequence and determining the tracking target;
S2, establishing a spatial context model for the first frame image using the STC algorithm;
S3, performing a blocking operation on the rectangular template region containing the tracking target in the first frame image, and training an SVM classifier with blocks of the manually selected target rectangle in the first frame image;
S4, reading the next frame image and learning a spatial context model from the previous frame image;
S5, updating the spatio-temporal context model of the current frame image using the spatial context model learned from the previous frame image in step S4;
S6, obtaining the confidence map of the current frame image, and obtaining the target position and confidence probability of the current frame image by maximizing the confidence map;
S7, judging the degree of occlusion of the tracking target in the current frame image according to the confidence probability obtained in step S6;
S8, selecting the corresponding algorithm to update the target position according to the occlusion degree judged in step S7;
and S9, repeating steps S4-S8 until the current video or image sequence has been fully processed.
2. The STC and block re-detection based multi-threaded visual target tracking method of claim 1, wherein: the blocking performed on the first frame image in step S3 includes vertical blocking, in which each sub-region is a rectangle with half the height and one-tenth the width of the template region, and horizontal blocking, in which each sub-region is a rectangle with half the width and one-tenth the height of the template region, giving forty sub-regions in total.
3. The STC and block re-detection based multi-threaded visual target tracking method of claim 1, wherein step S4 specifically comprises:
Let the current frame image be the (t+1)-th frame image and the target position in the previous, i.e. the t-th, frame image be $x^*$. The spatial context model $h^{sc}_t(x)$ is learned from the t-th frame image as follows:
The target neighborhood context prior probability is computed by

$$p(c(z) \mid o) = I(z)\, w_\sigma(z - x^*)$$

where $p(c(z) \mid o)$ is the context prior of the target neighborhood and describes the appearance of the local image region, $z$ is a position in the target neighborhood, $c(z)$ is an element of the context feature set $X^c = \{ c(z) = (I(z), z) \mid z \in \Omega_c(x^*) \}$, $o$ denotes the tracking target appearing in the current scene, $I(z)$ is the image gray value at position $z$, and $w_\sigma(\cdot)$ is the weight function defined by

$$w_\sigma(z) = a\, e^{-\frac{|z|^2}{\sigma^2}}$$

where $a$ is a normalization constant keeping the value of $p(c(z) \mid o)$ within the interval $[0, 1]$, and $\sigma$ is a scale parameter.
The target location likelihood probability confidence map is

$$c(x) = p(x \mid o) = b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} = h^{sc}(x) \otimes \left( I(x)\, w_\sigma(x - x^*) \right)$$

where $\otimes$ denotes convolution, $x \in \mathbb{R}^2$ is the target position, $b$ is a normalization constant, $\alpha$ is a scale parameter, $\beta$ is a shape parameter, and $\Omega_c(x^*)$ denotes the local context region of $x^*$.
To perform fast convolution calculations using the FFT, the above equation is transformed to the frequency domain:

$$\mathcal{F}\left( b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} \right) = \mathcal{F}\left( h^{sc}(x) \right) \odot \mathcal{F}\left( I(x)\, w_\sigma(x - x^*) \right)$$

where $\mathcal{F}$ denotes the FFT and $\odot$ element-wise multiplication. The spatial context model $h^{sc}(x)$ is therefore computed as

$$h^{sc}(x) = \mathcal{F}^{-1}\left( \frac{ \mathcal{F}\left( b\, e^{-\left|\frac{x - x^*}{\alpha}\right|^\beta} \right) }{ \mathcal{F}\left( I(x)\, w_\sigma(x - x^*) \right) } \right)$$

where $\mathcal{F}^{-1}$ is the inverse FFT.
4. The STC and block re-detection based multi-threaded visual target tracking method of claim 3, wherein step S5 specifically comprises:
Using the spatial context model $h^{sc}_t$ learned from the t-th frame image in step S4, the spatio-temporal context model for the (t+1)-th frame image is updated as

$$H^{stc}_{t+1} = (1 - \rho)\, H^{stc}_t + \rho\, h^{sc}_t$$

where $\rho$ is a learning parameter, $h^{sc}_t$ is the spatial context model learned from the t-th frame image, and $H^{stc}_t$ is the spatio-temporal context model accumulated over the first t frames of the video or image sequence.
5. The STC and block re-detection based multi-threaded visual target tracking method of claim 4, wherein the confidence map of the current frame image in step S6 is obtained as

$$c_{t+1}(x) = \mathcal{F}^{-1}\left( \mathcal{F}\left( H^{stc}_{t+1}(x) \right) \odot \mathcal{F}\left( I_{t+1}(x)\, w_{\sigma_t}(x - x^*_t) \right) \right)$$

where $x^*_t$ denotes the target location in the t-th frame.
6. The STC and block re-detection based multi-threaded visual target tracking method of claim 1, wherein: in step S7, when the confidence probability value is greater than a1, the tracked target is judged not occluded; when the confidence probability value is between a2 and a1, the tracked target is judged generally occluded; when the confidence probability value is less than a2, the tracked target is judged severely occluded or out of view; where 1 > a1 > a2 > 0, with a1 = 0.75 and a2 = 0.3.
7. The STC and block re-detection based multi-threaded visual target tracking method of claim 6, wherein in step S8:
(1) If the tracked target is judged not occluded, the current frame image continues target tracking with STC, and the target position is obtained by maximizing the confidence map $c_{t+1}(x)$ obtained in step S6:

$$x^*_{t+1} = \arg\max_x c_{t+1}(x)$$

that is, the target position of the current frame image obtained by maximizing the confidence map in step S6 is the final target position.
The target scale is also updated: since the target size may change over the video image sequence, the scale parameter $\sigma$ of the weight function $w_\sigma$ should also be updated. The update strategy for $\sigma$ is

$$s'_t = \sqrt{ \frac{ c_t(x^*_t) }{ c_{t-1}(x^*_{t-1}) } }, \qquad \bar{s}_t = \frac{1}{n} \sum_{i=1}^{n} s'_{t-i}, \qquad s_{t+1} = (1 - \lambda)\, s_t + \lambda\, \bar{s}_t, \qquad \sigma_{t+1} = s_{t+1}\, \sigma_t$$

where $c_t(\cdot)$ is the confidence distribution, $s'_t$ is the target scale estimated from two consecutive frame images, $\bar{s}_t$ is the average of the scale estimates over the previous n consecutive frames, and $\lambda > 0$ is a given filter parameter.
Meanwhile, the rectangular template region containing the tracking target in the current frame image is blocked, the gray histogram of each rectangular sub-region is extracted using the integral histogram, and the image HOG features of the target region are used to train the SVM classifier.
(2) If the tracked target is judged generally occluded, the target position is still updated with STC; in addition, when the confidence probability value is less than 0.7, the rectangular template region containing the tracking target in the current frame image is blocked, the gray histogram of each rectangular sub-region is extracted using the integral histogram, and the image HOG features of the target region are used to train the SVM classifier.
(3) If the tracked target is judged severely occluded or out of view, the occlusion of the tracked target in the current frame image is judged further using the blocks, specifically:
A search range of radius r is defined, centered on the target position obtained from the previous frame image. For each position (x, y) in the search range, a rectangular target region centered on (x, y) corresponds to the target region of the previous frame; the blocking operation is applied to the rectangular target region at position (x, y), and the gray histogram of each block is extracted using the integral histogram.
The similarity between blocks is then computed with the EMD, yielding for each block a similarity map of roughly the size of the search area. Minimizing this map gives the EMD value of the sub-block most similar to the current sub-region block within the search area. The values for all blocks are sorted in ascending order; when the fifth value is smaller than a set threshold, the target occlusion is light and the target position is still updated with STC. Otherwise, the target is severely occluded or out of view, the SVM classifier is used for scoring, and the target position is relocated by maximizing the score map.
8. The STC and block re-detection based multi-threaded visual target tracking method of claim 7, wherein the EMD is defined as follows:
Let

$$P = \{ (p_1, w_{p_1}), \ldots, (p_M, w_{p_M}) \}, \qquad Q = \{ (q_1, w_{q_1}), \ldots, (q_N, w_{q_N}) \}$$

where $p_i \in \mathbb{R}$, $q_j \in \mathbb{R}$, $p_i$ is a feature of one image, $q_j$ is a feature of the other image, $w_{p_i}$ is the weight of feature $p_i$, $w_{q_j}$ is the weight of feature $q_j$, and $d_{ij}$ is the distance between $p_i$ and $q_j$.
The flow $f_{ij}$ is obtained by solving the optimization problem

$$\min_f \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}\, d_{ij}$$

subject to

$$f_{ij} \ge 0, \quad i = 1, 2, \ldots, M; \; j = 1, 2, \ldots, N;$$
$$\sum_{j=1}^{N} f_{ij} \le w_{p_i}, \qquad \sum_{i=1}^{M} f_{ij} \le w_{q_j}, \qquad \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij} = \min\left( \sum_{i=1}^{M} w_{p_i}, \; \sum_{j=1}^{N} w_{q_j} \right)$$

where M and N denote the numbers of (feature, weight) pairs in the sets P and Q respectively. After solving for $f_{ij}$, the EMD is defined as

$$\mathrm{EMD}(P, Q) = \frac{ \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}\, d_{ij} }{ \sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij} }$$

When

$$\sum_{i=1}^{M} w_{p_i} = \sum_{j=1}^{N} w_{q_j}$$

that is, when the two histograms have equal total weight, the EMD satisfies the triangle inequality, i.e. the EMD is a true distance.
9. A computer-readable storage medium having stored thereon a computer program, characterized in that: the program when executed implements the method of any one of claims 1 to 8.
10. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein: the processor, when executing the program, implements the method of any one of claims 1 to 8.
CN201910716977.6A 2019-08-05 2019-08-05 Multithreading visual target tracking method based on STC and block re-detection Active CN110570451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910716977.6A CN110570451B (en) 2019-08-05 2019-08-05 Multithreading visual target tracking method based on STC and block re-detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910716977.6A CN110570451B (en) 2019-08-05 2019-08-05 Multithreading visual target tracking method based on STC and block re-detection

Publications (2)

Publication Number Publication Date
CN110570451A (en) 2019-12-13
CN110570451B CN110570451B (en) 2022-02-01

Family

ID=68774611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910716977.6A Active CN110570451B (en) 2019-08-05 2019-08-05 Multithreading visual target tracking method based on STC and block re-detection

Country Status (1)

Country Link
CN (1) CN110570451B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112053385A (en) * 2020-08-28 2020-12-08 西安电子科技大学 Remote sensing video shielding target tracking method based on deep reinforcement learning
CN112489085A (en) * 2020-12-11 2021-03-12 北京澎思科技有限公司 Target tracking method, target tracking device, electronic device, and storage medium
CN112541431A (en) * 2020-12-10 2021-03-23 中国科学院自动化研究所 High-resolution image target detection method and system
CN113034378A (en) * 2020-12-30 2021-06-25 香港理工大学深圳研究院 Method for distinguishing electric automobile from fuel automobile
CN113129333A (en) * 2020-01-16 2021-07-16 舜宇光学(浙江)研究院有限公司 Multi-target real-time tracking method and system and electronic equipment
CN113223054A (en) * 2021-05-28 2021-08-06 武汉卓目科技有限公司 Target tracking method and device for improving jitter property of ECO (equal cost offset) tracking frame
CN115712354A (en) * 2022-07-06 2023-02-24 陈伟 Man-machine interaction system based on vision and algorithm
CN116993785A (en) * 2023-08-31 2023-11-03 东之乔科技有限公司 Target object visual tracking method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631895A (en) * 2015-12-18 2016-06-01 重庆大学 Temporal-spatial context video target tracking method combining particle filtering
CN106875415A (en) * 2016-12-29 2017-06-20 北京理工雷科电子信息技术有限公司 The continuous-stable tracking of small and weak moving-target in a kind of dynamic background
CN107424175A (en) * 2017-07-20 2017-12-01 西安电子科技大学 A kind of method for tracking target of combination spatio-temporal context information
US20180046857A1 (en) * 2016-08-12 2018-02-15 Qualcomm Incorporated Methods and systems of updating motion models for object trackers in video analytics
CN108022254A (en) * 2017-11-09 2018-05-11 华南理工大学 A kind of space-time contextual target tracking based on sign point auxiliary

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105631895A (en) * 2015-12-18 2016-06-01 重庆大学 Temporal-spatial context video target tracking method combining particle filtering
US20180046857A1 (en) * 2016-08-12 2018-02-15 Qualcomm Incorporated Methods and systems of updating motion models for object trackers in video analytics
CN106875415A (en) * 2016-12-29 2017-06-20 北京理工雷科电子信息技术有限公司 The continuous-stable tracking of small and weak moving-target in a kind of dynamic background
CN107424175A (en) * 2017-07-20 2017-12-01 西安电子科技大学 A kind of method for tracking target of combination spatio-temporal context information
CN108022254A (en) * 2017-11-09 2018-05-11 华南理工大学 A kind of space-time contextual target tracking based on sign point auxiliary

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QIUCHEN LI et al.: "A framework of object tracking based on STC algorithm", 2017 First International Conference on Electronics Instrumentation & Information Systems (EIIS) *
LIU Peizhong et al.: "An online convolutional network tracking algorithm combining spatio-temporal context", Journal of Computer Research and Development *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113129333A (en) * 2020-01-16 2021-07-16 舜宇光学(浙江)研究院有限公司 Multi-target real-time tracking method and system and electronic equipment
CN112053385A (en) * 2020-08-28 2020-12-08 西安电子科技大学 Remote sensing video shielding target tracking method based on deep reinforcement learning
CN112053385B (en) * 2020-08-28 2023-06-02 西安电子科技大学 Remote sensing video shielding target tracking method based on deep reinforcement learning
CN112541431A (en) * 2020-12-10 2021-03-23 中国科学院自动化研究所 High-resolution image target detection method and system
CN112489085A (en) * 2020-12-11 2021-03-12 北京澎思科技有限公司 Target tracking method, target tracking device, electronic device, and storage medium
CN113034378A (en) * 2020-12-30 2021-06-25 香港理工大学深圳研究院 Method for distinguishing electric automobile from fuel automobile
CN113034378B (en) * 2020-12-30 2022-12-27 香港理工大学深圳研究院 Method for distinguishing electric automobile from fuel automobile
CN113223054A (en) * 2021-05-28 2021-08-06 武汉卓目科技有限公司 Target tracking method and device for improving jitter property of ECO (equal cost offset) tracking frame
CN115712354A (en) * 2022-07-06 2023-02-24 陈伟 Man-machine interaction system based on vision and algorithm
CN116993785A (en) * 2023-08-31 2023-11-03 东之乔科技有限公司 Target object visual tracking method and device, electronic equipment and storage medium
CN116993785B (en) * 2023-08-31 2024-02-02 东之乔科技有限公司 Target object visual tracking method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN110570451B (en) 2022-02-01

Similar Documents

Publication Publication Date Title
CN110570451B (en) Multithreading visual target tracking method based on STC and block re-detection
Aker et al. Using deep networks for drone detection
US20210248378A1 (en) Spatiotemporal action detection method
CN110363047B (en) Face recognition method and device, electronic equipment and storage medium
CN110728697B (en) Infrared dim target detection tracking method based on convolutional neural network
US8111873B2 (en) Method for tracking objects in a scene
CN103514441B (en) Facial feature point locating tracking method based on mobile platform
US20120076361A1 (en) Object detection device
US10896495B2 (en) Method for detecting and tracking target object, target object tracking apparatus, and computer-program product
CN109035295B (en) Multi-target tracking method, device, computer equipment and storage medium
CN103824070A (en) Rapid pedestrian detection method based on computer vision
WO2011067790A2 (en) Cost-effective system and method for detecting, classifying and tracking the pedestrian using near infrared camera
CN112926410A (en) Target tracking method and device, storage medium and intelligent video system
CN115240130A (en) Pedestrian multi-target tracking method and device and computer readable storage medium
CN105303571A (en) Time-space saliency detection method for video processing
CN114708300A (en) Anti-blocking self-adaptive target tracking method and system
CN112613565B (en) Anti-occlusion tracking method based on multi-feature fusion and adaptive learning rate updating
JP7488674B2 (en) OBJECT RECOGNITION DEVICE, OBJECT RECOGNITION METHOD, AND OBJECT RECOGNITION PROGRAM
CN111428567A (en) Pedestrian tracking system and method based on affine multi-task regression
JP6851246B2 (en) Object detector
CN114639117A (en) Cross-border specific pedestrian tracking method and device
De Padua et al. Particle filter-based predictive tracking of futsal players from a single stationary camera
CN106909934B (en) Target tracking method and device based on self-adaptive search
CN112489085A (en) Target tracking method, target tracking device, electronic device, and storage medium
Basit et al. Fast target redetection for CAMSHIFT using back-projection and histogram matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant