CN111915653A - Method for tracking double-station visual target - Google Patents

Method for tracking double-station visual target

Info

Publication number
CN111915653A
CN111915653A CN202010823124.5A
Authority
CN
China
Prior art keywords
target
tracking
filter
image
station
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010823124.5A
Other languages
Chinese (zh)
Other versions
CN111915653B (en)
Inventor
王小凌
孙忠海
史文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Aircraft Industry Group Co Ltd
Original Assignee
Shenyang Aircraft Industry Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Aircraft Industry Group Co Ltd filed Critical Shenyang Aircraft Industry Group Co Ltd
Priority to CN202010823124.5A priority Critical patent/CN111915653B/en
Publication of CN111915653A publication Critical patent/CN111915653A/en
Application granted granted Critical
Publication of CN111915653B publication Critical patent/CN111915653B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/262Analysis of motion using transform domain methods, e.g. Fourier domain methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A method for dual-station visual target tracking belongs to the technical field of target tracking in computer vision. The method comprises the following steps: acquiring the image region of the tracked target by target detection; extracting features from the two images, establishing a discriminative correlation filter model for each, and fusing the models; performing target tracking with the fusion model in subsequent frames; and judging the tracking result from the filter response, performing target re-identification, and updating the model online. The target tracking device for implementing the method comprises a binocular image acquisition unit, a detection unit, a tracking unit, and a servo drive unit.

Description

Method for tracking double-station visual target
Technical Field
The invention relates to the technical field of target tracking in computer vision, in particular to a method and a device for tracking a target by double-station vision.
Background
Visual target tracking is an important research topic in computer vision and is widely applied in fields such as video surveillance, security, and intelligent transportation. Owing to challenging factors such as illumination change, scale change, target deformation, fast motion, and occlusion, designing a real-time, accurate, and robust visual tracking system remains a difficult task. A typical visual tracking method consists of initialization, feature extraction, a motion model, an appearance model, and model updating.
Initialization of visual tracking is usually done by manually selecting the target region or automatically locating the target with a detection algorithm; in automated scenarios the initial tracking box is usually given by target detection. Feature extraction must not only be strongly discriminative but also computationally efficient, so as to meet the real-time requirement of tracking. The features used by tracking algorithms to date can be broadly divided into hand-crafted and learned features. The motion model is based on the assumption that the target's motion is continuous: candidate target positions may be sampled around the target position of the previous frame, e.g., by sliding windows, mean shift, or particle filters. The appearance model selects the target from the set of candidate positions and can be divided into generative and discriminative methods. Model updating adaptively adjusts to changes in target appearance: generative methods update a subspace or template of the target, while discriminative methods update the classifier by continuously adding positive and negative samples.
Visual tracking algorithms have developed significantly over the last decade. The article "Tracking-Learning-Detection" (IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422) combines tracking, online learning, and detection for long-term tracking, and the article "High-Speed Tracking with Kernelized Correlation Filters" (IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596) achieves fast and accurate tracking with kernelized correlation filters.
However, these advanced trackers are prone to failure when the target pose changes dramatically. Moreover, a visual tracking system based on a single station has field-of-view blind areas and cannot track a moving target over its entire course.
Disclosure of Invention
In view of the above problems, the present invention provides a dual-station visual tracking method and a hardware device to achieve real-time visual tracking of a target. The method eliminates the field-of-view blind area of single-station tracking and improves the ability to track objects whose pose changes rapidly, so that the target can be tracked continuously.
The technical scheme of the invention is as follows:
a method of dual station visual target tracking, the method comprising: training a specific class target detector to complete the initialization positioning of the target; respectively extracting features of the binocular images, establishing a relevant filtering model, and fusing; in the subsequent frame, target tracking is carried out by utilizing the fusion model; and judging that the target is lost or shielded by utilizing the related filtering response, and rediscovery by utilizing target detection.
The method comprises the following concrete steps:
Step 1: train a target detector offline for a particular target class; during online operation, acquire the images of the two stations and obtain the regions of the target in the two images with the target detector.
Step 2: process the two images separately: crop an image sample centered on the target position, extract features, train a correlation filter model for each image, and fuse the two models.
Step 3: in each subsequent frame, crop a search image region in each of the two images centered on the previous frame's target position, extract features, and obtain the position and scale of the target with the fusion model.
Step 4: set a response threshold; when the filter response falls below the threshold, judge the target lost or occluded and start online target detection to re-detect it; once the target is re-detected, repeat steps 2 and 3 until the last frame.
Further, the off-line training of the target detector in step 1 and the on-line detection process of the target detector in step 4 specifically include the following steps:
(1) loading a pre-trained YOLOv3 target detector model;
(2) annotating historical images and inputting them into the detection network for fine-tuning training;
(3) inputting the image to be detected into the detection network and outputting the target area.
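A minimal sketch of steps (1) and (3) is given below, assuming OpenCV's DNN module is used to run a Darknet-format YOLOv3 model; the file names, input resolution, and confidence threshold are illustrative assumptions, not values from the patent. The fine-tuning of step (2) would be done in the detector's own training framework and is not shown.

import cv2
import numpy as np

def detect_target(image, net, conf_thresh=0.5):
    """Return the best (x1, y1, x2, y2) box found by the detector, or None."""
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    best = None
    for out in outputs:                      # each row: cx, cy, w, h, objectness, class scores...
        for det in out:
            scores = det[5:]
            conf = det[4] * scores.max()
            if conf > conf_thresh and (best is None or conf > best[0]):
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                best = (conf, (int(cx - bw / 2), int(cy - bh / 2),
                               int(cx + bw / 2), int(cy + bh / 2)))
    return None if best is None else best[1]

# hypothetical model files produced by the fine-tuning of step (2):
# net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3_finetuned.weights")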
Furthermore, the annotation comprises the category of the target and a rectangular target frame. Let the ith frame image be I_i, the target category be C, and the target box have upper-left corner (x1, y1) and lower-right corner (x2, y2); the annotated content is then: C, x1, y1, x2, y2. The annotated samples are divided into a training set and a validation set and then used to train the YOLOv3 detector network. The binocular images are denoted I_i^L and I_i^R, i ∈ {0, 1, 2, ..., N}, where the right superscript indicates the left or right image; for the initial frame i = 0, the trained detector detects the target in I_0^L and I_0^R, giving target frames B^L and B^R.
Further, in step 2 and step 3, the specific steps of cropping the image are as follows (a sketch follows the list):
(1) the target box is (cx, cy, w, h), where (cx, cy) is the target center and (w, h) are the width and height of the target;
(2) with (cx, cy) as the center, a square image block whose side length is determined by w and h is cropped and resized to 240 × 240.
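The sketch below illustrates this cropping step. The exact side-length expression is given only as an equation image in the original, so the code assumes a common correlation-filter choice of a padded square with side 2·sqrt(w·h); that padding rule is an assumption, not the patent's formula.

import cv2
import numpy as np

def crop_square(image, box, out_size=240):
    """box = (cx, cy, w, h); returns an out_size x out_size patch centered on the target."""
    cx, cy, w, h = box
    side = int(round(2 * np.sqrt(w * h)))        # assumed padding rule
    x0, y0 = int(cx - side / 2), int(cy - side / 2)
    # pad the image so the crop never leaves the frame
    pad = side
    padded = cv2.copyMakeBorder(image, pad, pad, pad, pad, cv2.BORDER_REPLICATE)
    patch = padded[y0 + pad:y0 + pad + side, x0 + pad:x0 + pad + side]
    return cv2.resize(patch, (out_size, out_size))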
Further, in step 2 and step 3, the specific steps of feature extraction are as follows:
(1) extract histogram of oriented gradients and color features from the cropped image, normalize the feature-map scales, and concatenate them to obtain the multi-channel feature p;
(2) construct a two-dimensional cosine window matrix ω and multiply it element-wise with the multi-channel feature p to obtain the final multi-channel feature map x = ω ⊙ p.
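As an illustration of step (2), the sketch below builds a small multi-channel feature map and applies the two-dimensional cosine window; simple intensity and gradient channels stand in for the HOG and color features named above, and the cell size is an assumed value.

import cv2
import numpy as np

def multichannel_features(patch, cell=4):
    """Return a windowed M x N x D feature map for a cropped patch."""
    gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    p = np.stack([gray, np.abs(gx), np.abs(gy)], axis=-1)          # stand-in channels
    p = cv2.resize(p, (patch.shape[1] // cell, patch.shape[0] // cell))
    # 2-D cosine (Hann) window suppresses boundary effects
    window = np.outer(np.hanning(p.shape[0]), np.hanning(p.shape[1]))
    return p * window[:, :, None]                                   # x = window ⊙ p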
Further, in step 2, the specific steps for building the correlation filter model are:
(1) construct a two-dimensional Gaussian matrix y as the regression target, with its scale equal to the feature-map resolution and its Gaussian center at the feature-map center, and apply a fast Fourier transform to it to obtain Y;
(2) apply a two-dimensional fast Fourier transform to the multi-channel feature map x to obtain X;
(3) obtain the correlation filter coefficients F from the results of steps (1) and (2), i.e., the correlation filter model;
(4) perform steps (1)-(3) on the images of the two stations to obtain correlation filter models F^L and F^R; the fused correlation filter model is F = (1 - μ)F^L + μF^R, where μ ∈ (0, 1) is a weight coefficient.
Further, the multi-channel feature map is denoted x ∈ R^{M×N×D}, where M and N are the spatial dimensions of the feature map and D is the number of feature channels. The optimal correlation filter f is learned by assuming the target confidence obeys a spatial Gaussian distribution y ∈ R^{M×N}, i.e., the two-dimensional Gaussian matrix y, and minimizing the regression loss over all training samples. The objective function is defined as:

$$\min_{f}\;\left\|\sum_{d=1}^{D} f^{d} \ast x^{d} - y\right\|^{2} + \lambda \sum_{d=1}^{D}\left\|f^{d}\right\|^{2} \qquad (1)$$

where f is the discriminative correlation filter, * denotes convolution, λ is the regularization coefficient, and d indexes the feature channels. By transforming to the frequency domain, this ridge regression problem has the closed-form solution:

$$F^{d} = \frac{\bar{X}^{d} \odot Y}{\sum_{k=1}^{D} X^{k} \odot \bar{X}^{k} + \lambda} \qquad (2)$$

where capital letters denote the corresponding DFTs, ⊙ denotes element-wise multiplication, and X̄ denotes the complex conjugate of X.

Following the above steps, the two correlation filters F^L and F^R are obtained; the final fused correlation filter is then:

F = (1 - μ)F^L + μF^R    (3)

where μ is a weight coefficient.
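A compact sketch of equations (1)-(3) follows: the Gaussian regression target, the closed-form per-channel filter in the Fourier domain, and the fusion of the two station filters. The Gaussian width, regularization weight, and fusion weight are assumed values.

import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Gaussian regression target with its peak rolled to (0, 0)."""
    M, N = shape
    ys, xs = np.mgrid[0:M, 0:N]
    g = np.exp(-((ys - M // 2) ** 2 + (xs - N // 2) ** 2) / (2 * sigma ** 2))
    return np.roll(g, (-(M // 2), -(N // 2)), axis=(0, 1))

def train_filter(x, lam=1e-3):
    """x: M x N x D windowed feature map -> per-channel filter F in the Fourier domain."""
    Y = np.fft.fft2(gaussian_label(x.shape[:2]))
    X = np.fft.fft2(x, axes=(0, 1))
    num = np.conj(X) * Y[:, :, None]
    den = np.sum(X * np.conj(X), axis=2) + lam
    return num / den[:, :, None]

def fuse_filters(FL, FR, mu=0.5):
    return (1 - mu) * FL + mu * FR            # F = (1 - mu) F^L + mu F^R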
Further, in step 3, the position and scale of the target are obtained with the fusion model as follows:
Based on the target position and scale of the previous frame, an image block is cropped as the target search region as in step 3 and preprocessed to obtain the multi-channel feature map x of the search region; the response z of the target over the search region is:

$$z = \mathcal{F}^{-1}\!\left(\sum_{d=1}^{D} F^{d} \odot X^{d}\right) \qquad (4)$$

where $\mathcal{F}^{-1}$ denotes the inverse discrete Fourier transform and ⊙ denotes element-wise multiplication.
The response maps of the binocular images are z^L and z^R respectively, and the position of the peak of each response map is the position of the target; the target scale is determined by a multi-scale search, giving the target frame in each of the two images.
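The localization step of equation (4) can be sketched as below: the response map is the inverse FFT of the filtered search-region spectrum, and its peak gives the target's displacement from the search-window center. A multi-scale search would repeat this over a few resized crops and is omitted for brevity.

import numpy as np

def track_step(F, x_search):
    """F: fused filter; x_search: windowed feature map of the search region."""
    X = np.fft.fft2(x_search, axes=(0, 1))
    z = np.real(np.fft.ifft2(np.sum(F * X, axis=2)))      # response map
    peak = np.unravel_index(np.argmax(z), z.shape)
    M, N = z.shape
    dy = (peak[0] + M // 2) % M - M // 2                   # wrap to signed offsets
    dx = (peak[1] + N // 2) % N - N // 2
    return (dx, dy), z.max()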
Further, the specific process of step 4 is as follows: a threshold τ is set for the filter response of each binocular image; when max(z) < τ, target tracking is judged to have failed, so the detector is used again for target re-identification, and the correlation filter is re-initialized once the target is detected. A filter that is still valid is updated in a linear fashion with learning rate η:

F = (1 - η)F_{t-1} + ηF_t    (5)

where the subscript t-1 denotes the filter of the previous frame and t denotes the current frame.
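A short sketch of this failure test and the linear update of equation (5); the threshold τ and learning rate η are assumed values.

def update_or_redetect(F_prev, F_new, peak_value, tau=0.15, eta=0.02):
    """Return the updated filter, or None to signal that the detector should be re-run."""
    if peak_value < tau:                       # max(z) < tau -> tracking failed
        return None                            # caller re-detects and re-initializes the filter
    return (1 - eta) * F_prev + eta * F_new    # F = (1 - eta) F_{t-1} + eta F_t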
The invention also provides a device for dual-station visual target tracking, which comprises a binocular image acquisition unit, a detection unit, a tracking unit and a servo drive unit, wherein:
the binocular image acquisition unit is mainly used for clearly imaging a target;
the detection unit is mainly used for marking the area of the target in the image;
the tracking unit is mainly used for marking the area of the target in the subsequent frame;
the servo drive unit drives the visual axis of the lens to follow the spatial position of the target, mainly according to the angle by which the target deviates from the center of the field of view.
Compared with the prior art, the invention has the following advantages: the dual-station tracking method and device eliminate the field-of-view blind area of single-station tracking and improve tracking accuracy and stability; combined with target detection, they achieve automatic discovery of the target after it enters the field of view and real-time tracking over the whole course.
Drawings
FIG. 1 is a flow chart of a method of detection-assisted two-station visual target tracking in the present invention;
FIG. 2 is a schematic diagram of the dual-station tracking apparatus of the present invention.
Detailed Description
The invention is further illustrated by the accompanying drawings and the detailed description below.
The basic idea of the invention is as follows: the region of the target in the image is automatically marked by a YOLOv3 target detector, and the tracking stage is entered; the images of the two stations are modeled separately with multi-channel-feature discriminative correlation filters, which are fused into an enhanced model; when the target is occluded or lost, this is detected automatically and the target is re-found using the detector.
As shown in fig. 1, the method steps of the present invention are implemented as follows:
1. First, the target images are annotated with the category and the rectangular bounding box of the target. Suppose the ith frame image is I_i, the target category is C, and the target box has upper-left corner (x1, y1) and lower-right corner (x2, y2); the annotated content is then: C, x1, y1, x2, y2. The annotated samples are divided into a training set and a validation set in a 3:1 ratio and then used to train the YOLOv3 network.
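A small sketch of this annotation format C, x1, y1, x2, y2 and the 3:1 train/validation split; file paths and helper names are illustrative assumptions.

import random

def write_label(path, c, x1, y1, x2, y2):
    """Write one annotation line in the format C, x1, y1, x2, y2."""
    with open(path, "w") as f:
        f.write(f"{c},{x1},{y1},{x2},{y2}\n")

def split_train_val(samples, ratio=3):
    """Shuffle and split sample paths into training and validation sets at ratio:1."""
    random.shuffle(samples)
    k = len(samples) * ratio // (ratio + 1)    # 3:1 split
    return samples[:k], samples[k:]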
2. The binocular images are denoted I_i^L and I_i^R, i ∈ {0, 1, 2, ..., N}, where the right superscript indicates the left or right image. For the initial frame i = 0, the detector trained in step 1 detects the target in I_0^L and I_0^R, giving bounding boxes B^L and B^R.
3. Correlation filters are trained on the initial frames I_0^L and I_0^R of the binocular images respectively; the specific steps are as follows:
For an image I, the initial target frame is B = (cx, cy, w, h), where (cx, cy) is the target center position and w and h are the width and height of the target. First, an image block is cropped: with (cx, cy) as the target center, a square block whose side length is determined by w and h is cropped and resampled to a size of 240 × 240.
Then, a histogram of oriented gradients (HOG) and color names (CN) are extracted from the image block as mixed features, the feature-map scales are normalized, and the features are concatenated to obtain the multi-channel feature map x ∈ R^{M×N×D}, where M and N are the spatial dimensions of the feature map and D is the number of feature channels.
Finally, the optimal correlation filter f is learned by assuming the target confidence obeys a spatial Gaussian distribution y ∈ R^{M×N} and minimizing the regression loss over all training samples. The objective function is defined as:

$$\min_{f}\;\left\|\sum_{d=1}^{D} f^{d} \ast x^{d} - y\right\|^{2} + \lambda \sum_{d=1}^{D}\left\|f^{d}\right\|^{2} \qquad (6)$$

where f is the discriminative correlation filter, * denotes convolution, λ is the regularization coefficient, and d indexes the feature channels. By transforming to the frequency domain, this ridge regression problem has the closed-form solution:

$$F^{d} = \frac{\bar{X}^{d} \odot Y}{\sum_{k=1}^{D} X^{k} \odot \bar{X}^{k} + \lambda} \qquad (7)$$

where capital letters denote the corresponding DFTs.

Following the above steps, the two correlation filters F^L and F^R are obtained; the final fused correlation filter is:

F = (1 - μ)F^L + μF^R    (8)

where μ is a weight coefficient.
4. In subsequent frames, target tracking is performed using the fusion model.
The fusion model is used to track the target in each of the binocular images; the specific steps are as follows:
Based on the target position and scale of the previous frame, an image block is cropped as the target search region as in step 3 and preprocessed to obtain the sample x; the response z of the target over the search region is:

$$z = \mathcal{F}^{-1}\!\left(\sum_{d=1}^{D} F^{d} \odot X^{d}\right) \qquad (9)$$

where $\mathcal{F}^{-1}$ denotes the inverse discrete Fourier transform.
The response maps of the binocular images are z^L and z^R respectively, and the peak of each response map is located at the target. The target scale can be determined by a multi-scale search, giving the target frame in each of the two images.
5. Judge whether target tracking has succeeded via a response threshold, perform target re-identification, and update the filter.
For the filter response of each binocular image, a threshold τ is set; when max(z) < τ, target tracking is judged to have failed, so the detector is used again for target re-identification, and the correlation filter is re-initialized once the target is detected. A filter that is still valid is updated in a linear fashion with learning rate η:

F = (1 - η)F_{t-1} + ηF_t    (10)

where the subscript t-1 denotes the filter of the previous frame and t denotes the current frame.
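Tying steps 3-5 together, the sketch below runs one tracking step for a single station using the helper functions sketched earlier (crop_square, multichannel_features, track_step); it is called once per station per frame, and when the returned peak falls below τ the detector of step 1 is re-run for that station, as in the embodiment above. All names and constants are illustrative, not the patent's reference implementation.

def step_station(F, img, box, cell=4):
    """One tracking step for a single station: returns the updated box and the response peak."""
    cx, cy, w, h = box
    patch = crop_square(img, box)                        # 240 x 240 search region
    (dx, dy), peak = track_step(F, multichannel_features(patch, cell))
    scale = cell * 2.0 * (w * h) ** 0.5 / 240.0          # feature-map cells -> image pixels
    return (cx + dx * scale, cy + dy * scale, w, h), peak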
As shown in fig. 2, in order to implement the above tracking method, the present invention provides a dual-station visual target tracking apparatus:
the binocular image acquisition unit is mainly used for clearly imaging a target;
the detection unit is mainly used for marking the area of the target in the image in an initialization stage and a target re-identification stage;
the tracking unit is mainly used for marking the area of the target in the subsequent frame;
the servo drive units drive the lenses to follow the spatial position of the target according to the angle by which the target deviates from the center of each station's field of view (see the sketch below).
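As a hypothetical illustration of the servo drive unit's input, the sketch below converts the target's pixel offset from the image center into azimuth and elevation errors under a pinhole-camera assumption; the focal length in pixels is a calibration parameter, not a value given in the patent.

import math

def boresight_error(box, image_size, focal_px):
    """box = (cx, cy, w, h); returns (azimuth, elevation) errors in radians."""
    cx, cy = box[0], box[1]
    W, H = image_size
    az = math.atan2(cx - W / 2.0, focal_px)       # positive = target to the right of center
    el = math.atan2(H / 2.0 - cy, focal_px)       # positive = target above center
    return az, el                                 # drive the pan/tilt servo with these errors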
Example:
in this embodiment, the flying target shot by the two eyes is tracked, and the scene includes the challenging factors such as illumination change, rapid movement, attitude change and the like. The computer processing platform is provided with a CPU of Intel E5-1650v2, a memory of 64GB and a GPU of NVIDIA RTX2080 Ti.
The algorithm was verified on an outdoor binocular sequence with an aircraft as the tracking target; the video contains 5000 frames in total. To show the tracking behavior of the method intuitively, the results are as follows:
In frames 1-479, both stations detect and track the target normally. At frame 480, the left station tracks and detects normally while the right station loses the target; after the right station reads the left station's target position information, it re-detects the target at frame 540 and continues the tracking and detection task. At frame 780, the right station's image information is clearer than the left station's. At frame 4822, the left station's target detection disappears, and the left station reads the right station's target information; at frame 5000, the left station detects the target again and continues the tracking and detection task.
The results show that the algorithm handles challenging factors such as illumination change, occlusion, and attitude change well and outputs an accurate target region.

Claims (9)

1. The method for tracking the double-station visual target is characterized by comprising the following steps:
step 1: training a target detector offline for a particular target class; acquiring images of the two stations during online operation, and acquiring the areas of the target in the two images by using a target detector;
step 2: respectively processing two images, firstly cutting an image sample by taking a target position as a center, extracting characteristics, training a related filtering model, and fusing the two models;
step 3: in each subsequent frame, cropping a search image region in each of the two images centered on the previous frame's target position, extracting features, and obtaining the position and scale of the target with the fusion model;
step 4: setting a response threshold; when the filter response falls below the threshold, judging that the target is lost or occluded and starting online target detection to re-detect it; once the target is re-detected, repeating steps 2 and 3 until the last frame.
2. The method for two-station visual target tracking according to claim 1, wherein the off-line training of the target detector in step 1 and the on-line detection of the target detector in step 4 comprise the following steps:
(1) loading a pre-trained YOLOv3 target detector model;
(2) annotating historical images and inputting them into the detection network for fine-tuning training;
(3) inputting the image to be detected into the detection network and outputting the target area.
3. The method of claim 2, wherein the annotation comprises the category of the target and a rectangular target frame; the ith frame image is denoted I_i, the target category is C, and the target box has upper-left corner (x1, y1) and lower-right corner (x2, y2), so the annotated content is: C, x1, y1, x2, y2; the annotated samples are divided into a training set and a validation set and then used to train the YOLOv3 detector network; the binocular images are denoted I_i^L and I_i^R, where the right superscript indicates the left or right image; for the initial frame i = 0, the trained detector detects the target in I_0^L and I_0^R, giving target frames B^L and B^R.
4. The method for double-station visual target tracking according to claim 1, wherein in the step 2 and the step 3, the specific step of cropping the image is as follows:
(1) the target box is (cx, cy, w, h), where (cx, cy) is the target center and (w, h) is the width and height of the target;
(2) with (cx, cy) as the center, a square image block whose side length is determined by w and h is cropped and resized to 240 × 240.
5. The method for tracking a two-station visual target according to claim 1, wherein in the steps 2 and 3, the specific steps of feature extraction are as follows:
(1) extract histogram of oriented gradients and color features from the cropped image, normalize the feature-map scales, and concatenate them to obtain the multi-channel feature p;
(2) construct a two-dimensional cosine window matrix ω and multiply it element-wise with the multi-channel feature p to obtain the final multi-channel feature map x = ω ⊙ p.
6. The method for two-station visual target tracking according to claim 5, wherein in the step 2, the specific steps of the creation of the correlation filtering model are:
(1) construct a two-dimensional Gaussian matrix y as the regression target, with its scale equal to the feature-map resolution and its Gaussian center at the feature-map center, and apply a fast Fourier transform to it to obtain Y;
(2) apply a two-dimensional fast Fourier transform to the multi-channel feature map x to obtain X;
(3) obtain the correlation filter coefficients F from the results of steps (1) and (2), i.e., the correlation filter model;
(4) perform steps (1)-(3) on the images of the two stations to obtain correlation filter models F^L and F^R; the fused correlation filter model is F = (1 - μ)F^L + μF^R, where μ ∈ (0, 1) is a weight coefficient.
7. The method of two-station visual target tracking according to claim 6, wherein the multi-channel feature map is denoted x ∈ R^{M×N×D}, where M and N are the spatial dimensions of the feature map and D is the number of feature channels; an optimal correlation filter f is learned by assuming the target confidence obeys a spatial Gaussian distribution y ∈ R^{M×N}, i.e., the two-dimensional Gaussian matrix y, and minimizing the regression loss over all training samples, the objective function being defined as:

$$\min_{f}\;\left\|\sum_{d=1}^{D} f^{d} \ast x^{d} - y\right\|^{2} + \lambda \sum_{d=1}^{D}\left\|f^{d}\right\|^{2} \qquad (1)$$

where f is the discriminative correlation filter, * denotes convolution, λ is the regularization coefficient, and d indexes the feature channels; by transforming to the frequency domain, the ridge regression problem has the closed-form solution:

$$F^{d} = \frac{\bar{X}^{d} \odot Y}{\sum_{k=1}^{D} X^{k} \odot \bar{X}^{k} + \lambda} \qquad (2)$$

where capital letters denote the corresponding DFTs, ⊙ denotes element-wise multiplication, and X̄ denotes the complex conjugate of X;
following the above steps, the two correlation filters F^L and F^R are obtained, and the final fused correlation filter is:

F = (1 - μ)F^L + μF^R    (3)

where μ is a weight coefficient.
8. The method for tracking a two-station visual target according to claim 7, wherein in step 3 the position and scale of the target are obtained with the fusion model as follows:
based on the target position and scale of the previous frame, an image block is cropped as the target search region as in step 3 and preprocessed to obtain the multi-channel feature map x of the search region; the response z of the target over the search region is:

$$z = \mathcal{F}^{-1}\!\left(\sum_{d=1}^{D} F^{d} \odot X^{d}\right) \qquad (4)$$

where $\mathcal{F}^{-1}$ denotes the inverse discrete Fourier transform and ⊙ denotes element-wise multiplication;
the response maps of the binocular images are z^L and z^R respectively, and the position of the peak of each response map is the position of the target; the target scale is determined by a multi-scale search, giving the target frame in each of the two images.
9. The method for tracking a visual target of a double station as claimed in claim 7, wherein the specific process of step 4 is as follows: a threshold τ is set for the filter response of each binocular image; when max(z) < τ, target tracking is judged to have failed, so the detector is used again for target re-identification, and the correlation filter is re-initialized once the target is detected; a filter that is still valid is updated in a linear fashion with learning rate η:

F = (1 - η)F_{t-1} + ηF_t    (5)

where the subscript t-1 denotes the filter of the previous frame and t denotes the current frame.
CN202010823124.5A 2020-08-17 2020-08-17 Dual-station visual target tracking method Active CN111915653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010823124.5A CN111915653B (en) 2020-08-17 2020-08-17 Dual-station visual target tracking method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010823124.5A CN111915653B (en) 2020-08-17 2020-08-17 Dual-station visual target tracking method

Publications (2)

Publication Number Publication Date
CN111915653A true CN111915653A (en) 2020-11-10
CN111915653B CN111915653B (en) 2024-06-14

Family

ID=73279863

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010823124.5A Active CN111915653B (en) 2020-08-17 2020-08-17 Dual-station visual target tracking method

Country Status (1)

Country Link
CN (1) CN111915653B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767450A (en) * 2021-01-25 2021-05-07 开放智能机器(上海)有限公司 Multi-loss learning-based related filtering target tracking method and system
CN113608618A (en) * 2021-08-11 2021-11-05 兰州交通大学 Hand region tracking method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015163830A1 (en) * 2014-04-22 2015-10-29 Aselsan Elektronik Sanayi Ve Ticaret Anonim Sirketi Target localization and size estimation via multiple model learning in visual tracking
US20180182109A1 (en) * 2016-12-22 2018-06-28 TCL Research America Inc. System and method for enhancing target tracking via detector and tracker fusion for unmanned aerial vehicles
CN108875588A (en) * 2018-05-25 2018-11-23 武汉大学 Across camera pedestrian detection tracking based on deep learning
CN109461172A (en) * 2018-10-25 2019-03-12 南京理工大学 Manually with the united correlation filtering video adaptive tracking method of depth characteristic

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015163830A1 (en) * 2014-04-22 2015-10-29 Aselsan Elektronik Sanayi Ve Ticaret Anonim Sirketi Target localization and size estimation via multiple model learning in visual tracking
US20180182109A1 (en) * 2016-12-22 2018-06-28 TCL Research America Inc. System and method for enhancing target tracking via detector and tracker fusion for unmanned aerial vehicles
CN108875588A (en) * 2018-05-25 2018-11-23 武汉大学 Across camera pedestrian detection tracking based on deep learning
CN109461172A (en) * 2018-10-25 2019-03-12 南京理工大学 Manually with the united correlation filtering video adaptive tracking method of depth characteristic

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李娜; 赵祥模; 赵凤; 刘卫华; 王倩: "Research progress on target tracking algorithms based on appearance models", 计算机工程与科学 (Computer Engineering & Science), no. 03 *
董美宝; 杨涵文; 郭文; 马思源; 郑创: "Correlation-filter UAV visual tracking with multi-feature re-detection", 图学学报 (Journal of Graphics), no. 06 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112767450A (en) * 2021-01-25 2021-05-07 开放智能机器(上海)有限公司 Multi-loss learning-based related filtering target tracking method and system
CN113608618A (en) * 2021-08-11 2021-11-05 兰州交通大学 Hand region tracking method and system
CN113608618B (en) * 2021-08-11 2022-07-29 兰州交通大学 Hand region tracking method and system

Also Published As

Publication number Publication date
CN111915653B (en) 2024-06-14

Similar Documents

Publication Publication Date Title
CN109800689B (en) Target tracking method based on space-time feature fusion learning
CN108647588A (en) Goods categories recognition methods, device, computer equipment and storage medium
CN109146911B (en) Target tracking method and device
CN105930822A (en) Human face snapshot method and system
CN108537751B (en) Thyroid ultrasound image automatic segmentation method based on radial basis function neural network
CN110189257B (en) Point cloud acquisition method, device, system and storage medium
CN112926410A (en) Target tracking method and device, storage medium and intelligent video system
CN101406390A (en) Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects
Ali et al. Visual tree detection for autonomous navigation in forest environment
Xing et al. Traffic sign recognition using guided image filtering
CN111915653A (en) Method for tracking double-station visual target
CN113936210A (en) Anti-collision method for tower crane
CN112164093A (en) Automatic person tracking method based on edge features and related filtering
Gal Automatic obstacle detection for USV’s navigation using vision sensors
CN108664918B (en) Intelligent vehicle front pedestrian tracking method based on background perception correlation filter
Hempel et al. Pixel-wise motion segmentation for SLAM in dynamic environments
CN110827327A (en) Long-term target tracking method based on fusion
CN111640138A (en) Target tracking method, device, equipment and storage medium
CN114842506A (en) Human body posture estimation method and system
CN115311680A (en) Human body image quality detection method and device, electronic equipment and storage medium
CN111899284B (en) Planar target tracking method based on parameterized ESM network
Ginhoux et al. Model-based object tracking using stereo vision
Du et al. A high-precision vision-based mobile robot slope detection method in unknown environment
Andonovski et al. Development of a novel visual feature detection-based method for aircraft door identification using vision approach
Azi et al. Car tracking technique for DLES project

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant