CN114897938A - Improved cosine window correlation filtering target tracking method


Info

Publication number: CN114897938A
Authority: CN (China)
Prior art keywords: target, cosine window, image, scale, filter
Legal status: Pending
Application number: CN202210575210.8A
Other languages: Chinese (zh)
Inventors: Sun Jiawei (孙家伟), Deng Lizhen (邓丽珍), Zhu Hu (朱虎)
Current Assignee: Nanjing University of Posts and Telecommunications
Original Assignee: Nanjing University of Posts and Telecommunications
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2022-08-12
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202210575210.8A


Classifications

    • G: Physics
    • G06: Computing; Calculating or Counting
    • G06T: Image Data Processing or Generation, in General
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/248: Analysis of motion using feature-based methods involving reference images or patches
    • G06T 7/262: Analysis of motion using transform domain methods, e.g. Fourier domain methods
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/10024: Color image
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20081: Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A correlation filtering target tracking method with an improved cosine window. An improved, scale-adaptive cosine window is proposed: the cosine window is peak-clipped according to the initial size of the target and, to account for scale changes of the target, the peak-clipping position is dynamically adjusted by a scale change factor, which improves the tracking success rate of the tracking algorithm in scenes where the target scale changes. The ADTrack algorithm is also improved: self-constraints on, and a mutual constraint between, the trained target filter and background filter are introduced to further optimize the objective function, mitigating the tracking drift and rapid template degradation of existing trackers, and an ADMM optimization derivation is given for the new objective function.

Description

Improved cosine window correlation filtering target tracking method
Technical Field
The invention belongs to the technical field of target tracking, and particularly relates to a correlation filtering target tracking method with an improved cosine window.
Background
Image recognition and target tracking are key tasks in computer vision. Their core content is the continuous inference of a target's state in a video sequence: locating the tracked target in every frame to obtain the corresponding motion trajectory, and providing the target's active region for each frame of the image. Target tracking technology is now widely applied in both military and civilian fields and brings great convenience to people's lives, and as living standards improve, the requirements placed on target tracking grow ever higher. Although the related technology is refreshed every year, the algorithms still need performance optimization, and researchers still face many challenges.
In recent years, many researchers have proposed methods to build stable and accurate target trackers for the various problems of target tracking. In 2017, Galoogahi et al. proposed the BACF algorithm, whose key "cropping" idea improves sample quality: the original sampling region is enlarged to capture more background information, and the circularly shifted samples are then center-cropped, which both increases the number of samples and improves their quality. The ADTrack algorithm proposed by Li et al. in 2021 further improves BACF by extracting target information with a 0-1 mask and then training on the target and the target background information separately.
However, ADTrack still has shortcomings. First, it adopts the ordinary cosine window, a model that is zero everywhere around a central extremum; this ignores the actual size of the target, and preprocessing samples with it directly adds texture to the target and pollutes the target source information. In addition, the constraint terms of the ADTrack objective are not ideal: the filter is prone to overfitting, and the filter model degrades rapidly.
Disclosure of Invention
The invention relates to a correlation filtering target tracking method with an improved cosine window, in which filtering training is carried out separately on the target and on the background information. Target tracking currently faces many problems, such as target scale change and target occlusion. Aiming at target scale change, the invention constructs a scale-adaptive cosine window: the cosine window is peak-clipped and updated in time, two filters are used, one for the target and one for the target background, and their self-constraints and mutual constraint are combined to improve the correlation filtering target tracking model and thereby the tracking performance.
The technical scheme adopted by the invention is as follows:
A correlation filtering target tracking method with an improved cosine window comprises the following steps:
Step 1, preprocessing stage: judge whether the current frame is a dark scene; if so, enhance the video sequence; and compute a mask m for the subsequent filter training;
Step 2, feature extraction: obtain a feature set x_g from the training samples, adapt to target scale updates through the peak-clipped cosine window, and obtain the target feature set x_o from the mask m, yielding a scale-adaptive cosine window model;
Step 3, training stage: use the feature sets x_g and x_o obtained in Step 2 to train the two filters h_g and h_o of the next frame, where h_g is the filter trained on background information and h_o is the filter trained on target information;
Step 4, detection stage: obtain a target response map from the trained filters and the extracted sample channel features, and determine the position information of the target from the maximum of the target response map.
Further, Step 1 comprises the following sub-steps:
Step 1-1: given a color image I ∈ R^(W×H×3), perform photometric fusion of the RGB channels of the image:
V(x, y, I) = α_R Ψ_R(I(x, y)) + α_G Ψ_G(I(x, y)) + α_B Ψ_B(I(x, y))
where Ψ_m(I(x, y)) denotes the pixel value of channel m of the image at (x, y), and α_R + α_G + α_B = 1, converting the color image into a single-channel image; then carry out logarithmic averaging of the brightness:
V̄(I) = exp( (1/N) Σ_(x,y) log(δ + V(x, y, I)) )
where δ is a small constant that prevents the occurrence of log(0), N is the number of pixels, and V̄(I) denotes the logarithmic mean scene brightness of the current image; whether the current image is a dark scene is judged by introducing a threshold τ: if V̄(I) < τ, the image is a dark scene, denoted S(I);
Step 1-2: enhance the image using the previously acquired image brightness V(x, y, I) and image logarithmic mean brightness V̄(I). The global enhancement matrix of the image is computed from V(x, y, I), V̄(I), and V_max(I) (the defining equation is given only as an image in the original), where V_max(I) denotes the maximum value of the image brightness V(x, y, I); the three channels of the image are then enhanced with it, giving the enhanced image I_e, whose pixel value at the m-channel (x, y) location is Ψ_m(I_e(x, y)); the enhanced-part information of the image is obtained from the enhanced image:
E(I) = V(I) − V(I_e)
Step 1-3: obtain the mean μ and standard deviation σ of E(I), and threshold E(I) with them to obtain the global mask m_g (the thresholding equation is given only as an image in the original); cropping m_g with the clipping matrix P yields the desired mask m = m_g ⊙ P, P ∈ R^(w×h), used to extract target size information within the sample.
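As a concrete illustration of Step 1, the following Python sketch implements the dark-scene test and the global-mask thresholding described above; the equal channel weights, the default values of τ and δ, and the mean-plus-one-standard-deviation rule for m_g are all assumptions for illustration, not values taken from the patent:

```python
import numpy as np

def is_dark_scene(image, tau=0.25, delta=1e-6):
    """Dark-scene test of Step 1-1 for an RGB image with values in [0, 1].

    Uses equal channel weights (alpha_R = alpha_G = alpha_B = 1/3) and the
    logarithmic mean scene brightness; `tau` and `delta` are assumed values.
    """
    V = image.mean(axis=2)                       # V(x, y, I), single channel
    V_bar = np.exp(np.mean(np.log(delta + V)))   # log-average brightness
    return V_bar < tau, V, V_bar

def global_mask(E):
    """Global mask m_g of Step 1-3 from the enhanced-part information E(I).

    The exact thresholding rule appears only as an equation image in the
    original; marking pixels more than one standard deviation above the
    mean is an assumed, plausible reading.
    """
    mu, sigma = E.mean(), E.std()
    return (E > mu + sigma).astype(np.float32)
```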
Further, Step 2 comprises the following sub-steps:
step 2-1, a large number of training samples are obtained through cosine window preprocessing, a cyclic matrix and center clipping, cosine window processing is that a cosine window function is directly point-multiplied on the samples, and the operation of cyclic matrix and center clipping is shown in an attached figure 2. Extracting the characteristics of the obtained sample, including gray information, color information, gradient information and the like, to obtain a characteristic set x g
Step 2-2: peak-clip the cosine window using the initial size of the target, the peak-clipping position being a parameter Q ∈ (0, 1) obtained from the target size relative to the cosine window size (the two defining equations appear only as images in the original), where cosWin_0 is the original cosine window, W×H is the cosine window size, and w×h, with w ≤ W and h ≤ H, is the target size;
Step 2-3: after each update of the tracked target's scale, obtain the scale update factor S_scale and update the peak-clipping position of the cosine window as Q_scale = Q × S_scale, adapting the model scale; then use the mask m to obtain x_o = m ⊙ x_g, representing the pure target features, which yields the two feature sets x_g and x_o; the concrete scale-adaptive cosine window model is a function of the original cosine window cosWin_0, the peak-clipping position Q_scale, and the scale factor S_scale (its equation is given only as an image in the original).
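To make the peak clipping concrete, here is a minimal Python sketch of a scale-adaptive cosine window, assuming that "peak clipping at position Q_scale" means flattening the 2-D Hann window at the level it reaches at the edge of the central fraction Q_scale of its support; this reading, and the renormalization of the plateau to 1, are assumptions, since the defining equation appears only as an image in the original:

```python
import numpy as np

def clipped_cosine_window(W, H, Q, S_scale=1.0):
    """Peak-clipped cosine window of Step 2 (a sketch, not the patent's
    verbatim formula).

    W, H    : cosine window size
    Q       : initial peak-clipping position in (0, 1), derived from the
              ratio of the target size to the window size
    S_scale : scale update factor; Q_scale = Q * S_scale widens or narrows
              the flat plateau as the target scale changes
    """
    Q_scale = float(np.clip(Q * S_scale, 1e-3, 0.999))
    cos_win0 = np.outer(np.hanning(H), np.hanning(W))   # original window
    # Value of the 1-D Hann window at the edge of the central fraction
    # Q_scale of its support: everything above it is clipped flat.
    level = 0.5 * (1.0 + np.cos(np.pi * Q_scale))
    return np.minimum(cos_win0, level) / level          # plateau equals 1
```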
Further, Step 3 comprises the following sub-steps:
Step 3-1: the objective function of the filters (given only as an equation image in the original) combines, for k ∈ {g, o}, least-squares data terms with regularization, self-constraint, and mutual-constraint terms, where P is the sample clipping matrix; h_k^c denotes the target-information or background-information filter of the c-th channel; cosWin is the cosine window function proposed above, which varies with the target scale factor and is used to preprocess the training sample data; y is the ideal Gaussian model; h_t and h_(t−1) denote the filters of the current frame and the previous frame; the M matrix represents the relation between the two filters; λ is the constraint parameter of the filter regularization term; and μ is the constraint parameter of the mutual constraint term between the two filters;
Step 3-2: for the whole objective function, since k ∈ {g, o} and the last two terms constrain each other, the objective can be regarded as the sum of 7 parts; parts 1, 2, 4, and 5 form a conventional linear model augmented with the clipping matrix, i.e. least squares plus a regularization term whose purpose is to prevent the filter from overfitting; parts 3 and 6 are the self-constraints of the two filters, which effectively prevent rapid degradation of the filters; part 7 is the mutual constraint of the two filters, which binds the two filters to each other during training and gives them stronger discriminative power;
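For concreteness, the seven-part structure just described can be written in one formula, sketched below in LaTeX. This is a reconstruction from the verbal description, not the patent's verbatim equation; in particular, reusing λ as the weight of the self-constraint terms is an assumption.

```latex
E(h_g, h_o, M) = \sum_{k \in \{g,o\}} \Big[
    \tfrac{1}{2} \big\| y - \sum_{c=1}^{C} (\mathrm{cosWin} \odot x_k^c)
        \star (P^{\top} h_{k,t}^c) \big\|_2^2          % parts 1, 2: least squares
  + \tfrac{\lambda}{2} \| h_{k,t} \|_2^2               % parts 4, 5: regularization
  + \tfrac{\lambda}{2} \| h_{k,t} - h_{k,t-1} \|_2^2   % parts 3, 6: self-constraint
\Big]
+ \tfrac{\mu}{2} \| h_{g,t} - M h_{o,t} \|_2^2         % part 7: mutual constraint
```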
Step 3-3: solve by the ADMM iterative algorithm; since cosWin is only a preprocessing of the samples, it can be ignored during iteration. With h_o and M known, perform the ADMM iterative optimal solution for h_g: using the augmented Lagrangian method, introduce a relaxation variable ξ (defined by an equation that appears only as an image in the original), where P^T is the transpose of the clipping matrix P and I_N is the N×N identity matrix. The augmented Lagrangian form of the objective function (shown only as an equation image in the original) involves the Lagrange vector and a penalty factor γ; with the ADMM method it is converted iteratively into three subproblems, for h_g, for ξ*, and for the Lagrange multiplier update.
For the h_g subproblem, the closed-form solution of h_g is found from the first derivative (the solution appears only as an equation image in the original). The ξ* subproblem needs to be converted into the frequency domain for further calculation; it is decomposed into T subproblems, where T = 42 represents the dimension of the features. Setting up each subproblem and taking its derivative yields a linear system; the inverse matrix in that system is solved efficiently with the Sherman-Morrison equation, giving a closed-form update whose normalizing denominator is a scalar (the intermediate equations appear only as images in the original);
The iteration process of h_o is the same as that of h_g, and the M matrix has its own iterative update (given only as an equation image in the original).
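The Sherman-Morrison step can be checked in isolation: the identity (γI + x xᴴ)⁻¹ = (1/γ)(I − x xᴴ / (γ + xᴴx)) replaces the per-position T×T inversion with a few vector operations. The Python sketch below shows only this generic solve with an assumed penalty scalar; the full right-hand side of the patent's subproblem is not reproduced.

```python
import numpy as np

def sherman_morrison_solve(x, c, b):
    """Solve (c*I + x x^H) z = b for z without forming the T x T matrix,
    using the Sherman-Morrison identity; x and b are complex T-vectors
    and c > 0 is an ADMM penalty-type scalar."""
    s = c + np.real(np.vdot(x, x))        # c + x^H x, the scalar denominator
    return (b - x * (np.vdot(x, b) / s)) / c

# Quick check against a dense solve on random complex data:
rng = np.random.default_rng(0)
T = 42                                    # feature dimension used in the text
x = rng.standard_normal(T) + 1j * rng.standard_normal(T)
b = rng.standard_normal(T) + 1j * rng.standard_normal(T)
A = 3.0 * np.eye(T) + np.outer(x, np.conj(x))
assert np.allclose(sherman_morrison_solve(x, 3.0, b), np.linalg.solve(A, b))
```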
Further, in Step 4, the target response map R_(f+1) is obtained by inverse Fourier transform of the channel-wise correlations of the two filters with the corresponding sample features, fused with the weight ρ (the equation appears only as an image in the original). Here the hat symbol denotes the corresponding quantity of given data in the Fourier domain; F^(−1) denotes the response after an inverse Fourier transform; D represents the dimension of the filter; ĥ_g^f and ĥ_o^f are the two filters of the f-th frame; x̂_(f+1)^c represents the c-th channel feature of the search-area sample extracted from frame f+1, and x̂_o is its mask-processed version; ρ is the weight parameter controlling the two response maps generated by the background filter and the target filter.
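A minimal Python sketch of the Step 4 detection, assuming channel-stacked Fourier-domain features and filters and a linear fusion of the two response maps with weight ρ (the exact fusion form is given only as an equation image in the original):

```python
import numpy as np

def detect(x_hat, hg_hat, xo_hat, ho_hat, rho=0.5):
    """Fuse background- and target-filter responses and locate the target.

    x_hat, xo_hat : C x H x W complex FFTs of the search-region features
                    and of their masked version
    hg_hat, ho_hat: C x H x W complex FFTs of the two filters of frame f
    rho           : assumed fusion weight between the two response maps
    """
    R_g = np.fft.ifft2(np.sum(np.conj(hg_hat) * x_hat, axis=0)).real
    R_o = np.fft.ifft2(np.sum(np.conj(ho_hat) * xo_hat, axis=0)).real
    R = (1.0 - rho) * R_g + rho * R_o             # fused response map
    peak = np.unravel_index(np.argmax(R), R.shape)
    return R, peak                                # peak gives the new position
```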
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention discloses a scale-adaptive cosine window. The shortcomings of the existing cosine window are analyzed, and the cosine window is peak-clipped using the initial size of the target. In addition, the scale change of the target is taken into account: the peak-clipping position of the cosine window is dynamically adjusted through the scale change factor, improving the tracking success rate of the tracking algorithm in scenes where the target scale changes.
(2) The ADTrack algorithm is improved: self-constraints on, and a mutual constraint between, the trained target filter and background filter are introduced, further optimizing the objective function; the tracking drift and rapid template degradation of existing trackers are mitigated; and an ADMM optimization derivation is carried out for the new objective function.
Drawings
Fig. 1 illustrates the construction of a scale-adaptive cosine window in an embodiment of the present invention.
Fig. 2 is an illustration of a circulant matrix and a clipping matrix.
FIG. 3 is an overview framework of the algorithm model in an embodiment of the invention.
Fig. 4 shows the tracking results on the tiger1 video sequence of the TC128 data set in an embodiment of the present invention.
FIG. 5 shows the overall precision results of recent algorithms and the method of the present invention on the TC128 data set, in an embodiment of the invention.
FIG. 6 shows the overall success-rate results of recent algorithms and the method of the present invention on the TC128 data set, in an embodiment of the invention.
Detailed Description
The technical scheme of the invention is further explained in detail by combining the drawings in the specification.
The invention relates to a correlation filtering target tracking method with an improved cosine window, in which filtering training is carried out separately on the target and on the background information. Target tracking currently faces many problems, such as target scale change and target occlusion. Aiming at target scale change, the invention constructs a scale-adaptive cosine window: the cosine window is peak-clipped and updated in time, two filters are used, one for the target and one for the target background, and their self-constraints and mutual constraint are combined to improve the correlation filtering target tracking model and thereby the tracking performance.
The technical scheme of the invention mainly comprises the following contents. Aiming at the problem that the cosine window in correlation filtering target tracking algorithms increases the texture of a sample and pollutes the sample, a scale-adaptive cosine window function is constructed: the cosine window model is first peak-clipped based on the reference size of the target, and the peak-clipped cosine window model is then updated at appropriate times using the scale factor S_scale from the detection module of the DSST algorithm, making it scale-adaptive; the specific effect is shown in Figure 1. The overall tracking model of the invention is based on the ADTrack algorithm, with the specific effect shown in Figure 3; besides the scale-adaptive cosine window, the self-constraints and the mutual constraint of the filters are optimized on the basis of the basic ADTrack model, and the ADMM iteration of the model is re-derived.
The method comprises the following specific steps:
step 1: in the preprocessing stage, a threshold value tau is used for judging whether the scene is dark, and the optimal value of tau is obtained by adjusting parameters. And if the video sequence is judged to be in a dark scene, enhancing the video sequence to improve the accuracy and robustness of target tracking in a night state, and obtaining a mask m for training two subsequent filters.
Given a color image I ∈ R^(W×H×3), perform photometric fusion of the RGB channels of the image:
V(x, y, I) = α_R Ψ_R(I(x, y)) + α_G Ψ_G(I(x, y)) + α_B Ψ_B(I(x, y))
where Ψ_m(I(x, y)) denotes the pixel value of channel m of the image at (x, y), and α_R + α_G + α_B = 1; this can be understood as converting the color image into a single-channel image. Then carry out logarithmic averaging of the brightness:
V̄(I) = exp( (1/N) Σ_(x,y) log(δ + V(x, y, I)) )
where δ is a small value intended to prevent a log(0) error condition, N is the number of pixels, and V̄(I) represents the logarithmic mean scene brightness of the current image; whether the current image is a dark scene is then judged by introducing a threshold τ: an image with V̄(I) < τ is defined as a dark scene, denoted S(I).
The image is then enhanced using the previously acquired image brightness V(x, y, I) and image logarithmic mean brightness V̄(I). The global enhancement matrix of the image is computed from V(x, y, I), V̄(I), and V_max(I) (its equation is given only as an image in the original), where V_max(I) represents the maximum value of the image brightness V(x, y, I); the three channels of the image can then be enhanced with it, where I_e represents the enhanced image and Ψ_m(I_e(x, y)) the pixel value of the enhanced image at the m-channel (x, y) position. From the enhanced image, the enhanced-part information of the image is easily obtained:
E(I) = V(I) − V(I_e)
Finally, the mean μ and standard deviation σ of E(I) are obtained, from which the global mask m_g is computed by thresholding (the equation appears only as an image in the original). Referring to the clipping matrix P in the BACF algorithm, m_g is cropped to obtain the desired mask m = m_g ⊙ P, P ∈ R^(w×h), used to extract target size information within the sample.
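In array terms, applying the clipping matrix P amounts to taking the central w × h sub-block of the full-size mask; a short sketch (with assumed centered, integer-division indexing) makes this explicit:

```python
def crop_mask(m_g, w, h):
    """Central w x h crop of the full-size numpy mask m_g, i.e. the array
    reading of m = m_g applied through the BACF clipping matrix P."""
    H_full, W_full = m_g.shape
    top, left = (H_full - h) // 2, (W_full - w) // 2
    return m_g[top:top + h, left:left + w]
```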
Step 2: feature extraction. A large number of training samples are obtained through cosine window preprocessing, a circulant matrix, and center cropping; cosine window preprocessing directly point-multiplies the cosine window function onto the samples, and the circulant-matrix and center-cropping operations are shown in Figure 2. Features are then extracted from the resulting samples, including gray-scale information, color information, and gradient information, to obtain the feature set x_g. In view of the shortcomings of the original cosine window, the invention improves it, with the specific steps shown in Figure 1: the cosine window is first peak-clipped using the initial size of the target, the peak-clipping position being a parameter Q ∈ (0, 1) obtained from the target size relative to the cosine window size (the two defining equations appear only as images in the original), where cosWin_0 is the original cosine window, W×H is the cosine window size, and w×h, with w ≤ W and h ≤ H, is the target size.
Then, after each target scale update, the scale update factor S_scale is obtained and the peak-clipping position of the cosine window is updated as Q_scale = Q × S_scale, adapting the model scale. The mask m is then used to obtain x_o = m ⊙ x_g, representing the pure target features; at this point the two feature sets x_g and x_o have been obtained. The concrete scale-adaptive cosine window model is a function of the original cosine window cosWin_0, the peak-clipping position Q_scale, and the scale factor S_scale, with W×H the cosine window size and w×h the target size (its equation is given only as an image in the original).
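Continuing the hypothetical clipped_cosine_window sketch given earlier, the per-frame scale adaptation then reduces to re-evaluating the window with the new factor:

```python
# Window for a 100 x 100 search region whose target initially covers
# roughly 40% of the window; after the detected scale grows by 5%,
# the flat plateau widens accordingly (Q_scale = Q * S_scale).
win_t0 = clipped_cosine_window(100, 100, Q=0.4, S_scale=1.00)
win_t1 = clipped_cosine_window(100, 100, Q=0.4, S_scale=1.05)
assert (win_t1 >= win_t0 - 1e-12).all()   # larger plateau, gentler taper
```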
Step 3: training stage. From the Step 2 feature sets x_g and x_o, two filters are trained for the next frame: h_g, the filter trained on background information, and h_o, the filter trained on target information. The algorithm model is improved on the basis of the ADTrack model; the concrete model is shown in Figure 3, and its objective function (given only as an equation image in the original) combines, for k ∈ {g, o}, least-squares data terms with regularization, self-constraint, and mutual-constraint terms, where P is the sample clipping matrix of the BACF algorithm, the reference algorithm of ADTrack; h_k^c represents the target-information or background-information filter of the c-th channel; cosWin is the cosine window function proposed above, which varies with the target scale factor and is used to preprocess the training sample data; y is the ideal Gaussian model; h_t and h_(t−1) represent the filters of the current frame and the previous frame; the M matrix represents the relation between the two filters; λ is the constraint parameter of the filter regularization term; and μ is the constraint parameter of the mutual constraint term between the two filters.
For the whole objective function, since k ∈ {g, o} and the last two terms constrain each other, the objective can be regarded as the sum of 7 parts: parts 1, 2, 4, and 5 form a conventional linear model augmented with the clipping matrix, i.e. least squares plus a regularization term whose purpose is to prevent the filter from overfitting; parts 3 and 6 are the self-constraints of the two filters, which effectively prevent rapid degradation of the filters; part 7 is the mutual constraint of the two filters, which binds the two filters to each other during training so that their discriminative power is stronger.
The solution is then performed by the ADMM iterative algorithm. In the iterative process, h_o and M can first be assumed known, and the ADMM iterative optimal solution is carried out for h_g. Since the cosine window function is only a preprocessing of the samples, it can be ignored when performing the iteration. Using the augmented Lagrangian method, a relaxation variable ξ is introduced (defined by an equation that appears only as an image in the original), where P^T is the transpose of the clipping matrix P and I_N is the N×N identity matrix. The augmented Lagrangian form of the objective function (shown only as an equation image in the original) involves the Lagrange vector and the penalty factor γ. With the ADMM method, it can be converted iteratively into three subproblems, for h_g, for ξ*, and for the Lagrange multiplier update.
For the h_g subproblem, the closed-form solution of h_g can be found directly from the first derivative (the solution appears only as an equation image in the original). The ξ* subproblem needs to be converted into the frequency domain for further calculation; it can be decomposed into T subproblems, where T = 42 represents the dimension of the features. Setting up each subproblem and deriving it yields a linear system; because of the matrix division, the amount of calculation is large, so the Sherman-Morrison equation is needed to optimize the solution of the inverse matrix, giving a closed-form update whose normalizing denominator is a scalar (the intermediate equations appear only as images in the original).
The iteration processes of h_g and h_o are substantially the same and are not repeated here; the M matrix has its own iterative update (given only as an equation image in the original).
Step 4: detection stage. The target response map R_(f+1) can be expressed as the inverse Fourier transform of the channel-wise correlations of the two filters with the corresponding sample features, fused with the weight ρ (the equation appears only as an image in the original), where the hat symbol represents the corresponding quantity of the given data in the Fourier domain; F^(−1) represents the response after an inverse Fourier transform; D corresponds to the dimension of the filter; ĥ_g^f and ĥ_o^f are the two filters of the f-th frame; x̂_(f+1)^c represents the c-th channel feature of the search-area sample extracted from frame f+1, and x̂_o is its mask-processed version; ρ is the weight parameter controlling the two response maps generated by the background filter and the target filter. Finally, the maximum value of the response map R_(f+1) determines the position information of the target.
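A toy run of the hypothetical detect() sketch given earlier illustrates the peak localization: when the sample equals the template, the fused response peaks at zero displacement.

```python
rng = np.random.default_rng(1)
feat = rng.standard_normal((31, 50, 50))          # 31-channel toy features
feat_hat = np.fft.fft2(feat, axes=(-2, -1))
R, peak = detect(feat_hat, feat_hat, feat_hat, feat_hat, rho=0.5)
print(peak)   # (0, 0): the autocorrelation peaks at zero displacement
```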
Through the scale-adaptive cosine window and the self-constraints and mutual constraint of the filters, the method improves the tracking performance of the target tracking algorithm in scale-change scenes, reduces the problem of template drift during tracking, and greatly improves the precision and success rate of the algorithm. As shown in Figures 5-6 and Table 1, the tracking success rate of the method ranks first, and its precision ties for first with the AutoTrack algorithm.
TABLE 1. Overall precision and success rate of each algorithm on the TC128 data set

Metric      Ours   CPCF   ADTrack  AutoTrack  BACF   KCF    SRDCF  BiCF
Precision   0.702  0.697  0.689    0.702      0.644  0.544  0.644  0.641
Success     0.649  0.617  0.619    0.629      0.610  0.454  0.584  0.559
The above description is only a preferred embodiment of the present invention, and the scope of the present invention is not limited to this embodiment; equivalent modifications or changes made by those skilled in the art according to the present disclosure shall fall within the protection scope set forth in the appended claims.

Claims (6)

1. A correlation filtering target tracking method with an improved cosine window, characterized by comprising the following steps:
Step 1, preprocessing stage: judge whether the current frame is a dark scene; if so, enhance the video sequence; and compute a mask m for the subsequent filter training;
Step 2, feature extraction: obtain a feature set x_g from the training samples, adapt to target scale updates through the peak-clipped cosine window, and obtain the target feature set x_o from the mask m, yielding a scale-adaptive cosine window model;
Step 3, training stage: use the feature sets x_g and x_o obtained in Step 2 to train the two filters h_g and h_o of the next frame, where h_g is the filter trained on background information and h_o is the filter trained on target information;
Step 4, detection stage: obtain a target response map from the trained filters and the extracted sample channel features, and determine the position information of the target from the maximum of the target response map.
2. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that Step 1 comprises the following sub-steps:
Step 1-1: given a color image I ∈ R^(W×H×3), perform photometric fusion of the RGB channels of the image:
V(x, y, I) = α_R Ψ_R(I(x, y)) + α_G Ψ_G(I(x, y)) + α_B Ψ_B(I(x, y))
where Ψ_m(I(x, y)) denotes the pixel value of channel m of the image at (x, y), and α_R + α_G + α_B = 1, converting the color image into a single-channel image; then carry out logarithmic averaging of the brightness:
V̄(I) = exp( (1/N) Σ_(x,y) log(δ + V(x, y, I)) )
where δ is a constant that prevents the occurrence of log(0), N is the number of pixels, and V̄(I) represents the logarithmic mean scene brightness of the current image; whether the current image is a dark scene is judged by introducing a threshold τ: if V̄(I) < τ, the image is a dark scene, denoted S(I);
Step 1-2: enhance the image using the previously acquired image brightness V(x, y, I) and image logarithmic mean brightness V̄(I). The global enhancement matrix of the image is computed from V(x, y, I), V̄(I), and V_max(I) (its equation is given only as an image in the original), where V_max(I) represents the maximum value of the image brightness V(x, y, I); the three channels of the image are then enhanced with it, where I_e represents the enhanced image and Ψ_m(I_e(x, y)) the pixel value of the enhanced image at the m-channel (x, y) location; the enhanced-part information of the image is obtained from the enhanced image:
E(I) = V(I) − V(I_e)
Step 1-3: obtain the mean μ and standard deviation σ of E(I), and threshold E(I) with them to obtain the global mask m_g (the thresholding equation appears only as an image in the original); cropping m_g with the clipping matrix P yields the desired mask m = m_g ⊙ P, P ∈ R^(w×h), used to extract target size information within the sample.
3. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that Step 2 comprises the following sub-steps:
Step 2-1: a large number of training samples are obtained through cosine window preprocessing, a circulant matrix, and center cropping, cosine window preprocessing being a direct point-multiplication of the cosine window function onto the samples; features are extracted from the resulting samples, including gray-scale information, color information, and gradient information, to obtain the feature set x_g;
Step 2-2: peak-clip the cosine window using the initial size of the target, the peak-clipping position being a parameter Q ∈ (0, 1) obtained from the target size relative to the cosine window size (the two defining equations appear only as images in the original), where cosWin_0 is the original cosine window, W×H is the cosine window size, and w×h, with w ≤ W and h ≤ H, is the target size;
Step 2-3: after each update of the tracked target's scale, obtain the scale update factor S_scale and update the peak-clipping position of the cosine window as Q_scale = Q × S_scale, adapting the model scale; then use the mask m to obtain x_o = m ⊙ x_g, representing the pure target features, which yields the two feature sets x_g and x_o; the concrete scale-adaptive cosine window model is a function of the original cosine window cosWin_0, the peak-clipping position Q_scale, and the scale factor S_scale (its equation is given only as an image in the original).
4. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that Step 3 comprises the following sub-steps:
Step 3-1: the objective function of the filters (given only as an equation image in the original) combines, for k ∈ {g, o}, least-squares data terms with regularization, self-constraint, and mutual-constraint terms, where P is the sample clipping matrix; h_k^c represents the target-information or background-information filter of the c-th channel; cosWin is the cosine window function proposed above, which varies with the target scale factor and is used to preprocess the training sample data; y is the ideal Gaussian model; h_t and h_(t−1) represent the filters of the current frame and the previous frame; the M matrix represents the relation between the two filters; λ is the constraint parameter of the filter regularization term; and μ is the constraint parameter of the mutual constraint term between the two filters;
Step 3-2: for the whole objective function, since k ∈ {g, o} and the last two terms constrain each other, the objective is regarded as the sum of 7 parts; parts 1, 2, 4, and 5 form a conventional linear model augmented with the clipping matrix, i.e. least squares plus a regularization term to prevent the filter from overfitting; parts 3 and 6 are the self-constraints of the two filters, preventing rapid degradation of the filters; part 7 is the mutual constraint of the two filters, which binds the two filters to each other during training so that their discriminative power is stronger;
Step 3-3: solve by the ADMM iterative algorithm; since cosWin is only a preprocessing of the samples, it can be ignored during iteration; with h_o and M known, perform the ADMM iterative optimal solution for h_g: using the augmented Lagrangian method, introduce a relaxation variable ξ (defined by an equation that appears only as an image in the original), where P^T is the transpose of the clipping matrix P and I_N is the N×N identity matrix.
5. The correlation filtering target tracking method with an improved cosine window according to claim 4, characterized in that in Step 3-3 the augmented Lagrangian form of the objective function (given only as an equation image in the original) involves the Lagrange vector, the penalty factor γ, the N×N identity matrix I_N, and the N×N Fourier matrix F_N; with the ADMM method it is converted iteratively into three subproblems, for h_g, for ξ*, and for the Lagrange multiplier update. For the h_g subproblem, the closed-form solution of h_g is found from the first derivative (the solution appears only as an equation image in the original). The ξ* subproblem needs to be converted into the frequency domain for further calculation; it is decomposed into T subproblems, where T = 42 represents the dimension of the features. Setting up each subproblem and deriving it yields a linear system whose inverse matrix is solved efficiently with the Sherman-Morrison equation, giving a closed-form update whose normalizing denominator is a scalar (the intermediate equations appear only as images in the original). The iteration process of h_o is the same as that of h_g, and the M matrix has its own iterative update (given only as an equation image in the original).
6. The correlation filtering target tracking method with an improved cosine window according to claim 1, characterized in that in Step 4 the target response map R_(f+1) is expressed as the inverse Fourier transform of the channel-wise correlations of the two filters with the corresponding sample features, fused with the weight ρ (the equation appears only as an image in the original), where the hat symbol represents the corresponding quantity of given data in the Fourier domain; F^(−1) represents the response after an inverse Fourier transform; D represents the dimension of the filter; ĥ_g^f and ĥ_o^f are the two filters of the f-th frame; x̂_(f+1)^c represents the c-th channel feature of the search-area sample extracted from frame f+1, and x̂_o is its mask-processed version; ρ is the weight parameter controlling the two response maps generated by the background filter and the target filter.
CN202210575210.8A 2022-05-25 2022-05-25 Improved cosine window correlation filtering target tracking method Pending CN114897938A

Priority Applications (1)

Application Number CN202210575210.8A; Priority Date 2022-05-25; Filing Date 2022-05-25; Title: Improved cosine window correlation filtering target tracking method

Applications Claiming Priority (1)

Application Number CN202210575210.8A; Priority Date 2022-05-25; Filing Date 2022-05-25; Title: Improved cosine window correlation filtering target tracking method

Publications (1)

Publication Number CN114897938A; Publication Date 2022-08-12

Family

ID=82725509

Family Applications (1)

Application Number CN202210575210.8A; Title: Improved cosine window correlation filtering target tracking method; Priority Date 2022-05-25; Filing Date 2022-05-25

Country Status (1)

Country CN; Document CN114897938A

Citations (3)

* Cited by examiner, † Cited by third party

CN102496016A * (priority 2011-11-22, published 2012-06-13, Wuhan University): Infrared target detection method based on space-time cooperation framework
CN111951298A * (priority 2020-06-25, published 2020-11-17, Hunan University): Target tracking method fusing time series information
US20210227132A1 * (priority 2018-05-30, published 2021-07-22, Arashi Vision Inc.): Method for tracking target in panoramic video, and panoramic camera

Non-Patent Citations (1)

Wang Yongxiong et al., "Correlation filtering tracking algorithm based on background awareness and fast size discrimination", Journal of Data Acquisition and Processing, No. 02, 15 March 2020 *

Similar Documents

Publication Publication Date Title
CN109389608B (en) There is the fuzzy clustering image partition method of noise immunity using plane as cluster centre
CN111489364B (en) Medical image segmentation method based on lightweight full convolution neural network
CN109410247A (en) A kind of video tracking algorithm of multi-template and adaptive features select
CN107657625A (en) Merge the unsupervised methods of video segmentation that space-time multiple features represent
CN112329784A (en) Correlation filtering tracking method based on space-time perception and multimodal response
CN110322445A (en) A kind of semantic segmentation method based on maximization prediction and impairment correlations function between label
CN110853064B (en) Image collaborative segmentation method based on minimum fuzzy divergence
CN114092353A (en) Infrared image enhancement method based on weighted guided filtering
CN112001294A (en) YOLACT + + based vehicle body surface damage detection and mask generation method and storage device
CN111310609A (en) Video target detection method based on time sequence information and local feature similarity
KR101018299B1 (en) Apparatus and method for detecting a plurality of objects in an image
CN110991554B (en) Improved PCA (principal component analysis) -based deep network image classification method
CN111126169B (en) Face recognition method and system based on orthogonalization graph regular nonnegative matrix factorization
CN115797205A (en) Unsupervised single image enhancement method and system based on Retinex fractional order variation network
CN113947732B (en) Aerial visual angle crowd counting method based on reinforcement learning image brightness adjustment
CN116433909A (en) Similarity weighted multi-teacher network model-based semi-supervised image semantic segmentation method
CN116110113A (en) Iris recognition method based on deep learning
CN104952071A (en) Maximum between-cluster variance image segmentation algorithm based on GLSC (gray-level spatial correlation)
CN116543162B (en) Image segmentation method and system based on feature difference and context awareness consistency
CN117649694A (en) Face detection method, system and device based on image enhancement
CN112528077A (en) Video face retrieval method and system based on video embedding
CN109543684B (en) Real-time target tracking detection method and system based on full convolution neural network
CN115690704B (en) LG-CenterNet model-based complex road scene target detection method and device
CN114897938A (en) Improved cosine window correlation filtering target tracking method
CN116884067A (en) Micro-expression recognition method based on improved implicit semantic data enhancement

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
CB03: Change of inventor or designer information
    Inventor after: Deng Lizhen; Sun Jiawei; Zhu Hu
    Inventor before: Sun Jiawei; Deng Lizhen; Zhu Hu