CN111598925B - Visual target tracking method and device based on ECO algorithm and region growing segmentation


Info

Publication number
CN111598925B
Authority
CN
China
Prior art keywords
sample
target
algorithm
image
eco
Prior art date
Legal status
Active
Application number
CN202010410576.0A
Other languages
Chinese (zh)
Other versions
CN111598925A (en)
Inventor
鹿璇
李磊
周严
Current Assignee
Wuhan Zmvision Technology Co ltd
Original Assignee
Wuhan Zmvision Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Wuhan Zmvision Technology Co ltd
Priority to CN202010410576.0A
Publication of CN111598925A
Application granted
Publication of CN111598925B


Classifications

    • G06T 7/246 — Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06F 18/25 — Pattern recognition; analysing; fusion techniques
    • G06T 7/11 — Segmentation; edge detection; region-based segmentation
    • G06T 2207/10016 — Image acquisition modality: video; image sequence
    • G06T 2207/20081 — Special algorithmic details: training; learning
    • G06T 2207/20101 — Interactive image processing based on input by user: interactive definition of point of interest, landmark or seed

Abstract

The invention provides a visual target tracking method and device based on an ECO algorithm and region growing segmentation, wherein the method comprises the following steps: reading the first frame image of a video or image sequence and determining the tracking target; taking the tracking target of the current frame as a sample and feeding it to the ECO algorithm to train a correlation filter; reading the next frame of image and obtaining the target position by convolving the trained correlation filter with the image candidate region; judging whether the target in the image is occluded or deformed; if occlusion or deformation occurs, segmenting the candidate region and taking the segmented target as a sample in place of the sample generated by the ECO algorithm; updating the sample set of the ECO algorithm with a sample fusion update strategy; training the ECO algorithm with the acquired sample set and updating the correlation filter; and executing the above steps in a loop. When the target is occluded or deformed, the invention updates the ECO sample set in combination with region growing segmentation, improving the training effect.

Description

Visual target tracking method and device based on ECO algorithm and region growing segmentation
Technical Field
The invention relates to the field of target tracking in computer vision applications, and in particular to a visual target tracking method and device based on an ECO algorithm and region growing segmentation.
Background
Visual target tracking is an important research direction in computer vision, with extremely wide applications in military fields such as unmanned aerial vehicles, precision guidance and airborne early warning, and in civilian fields such as video surveillance, human-computer interaction and driverless vehicles. Target tracking nevertheless faces challenges such as target scale change, heavy occlusion, rapid motion, out-of-view targets and illumination change, so providing a reliable real-time visual target tracking method has important practical significance.
Traditional correlation-filter visual target tracking algorithms mainly use given samples to train a discriminative classifier that decides whether a region is the tracked target or surrounding background. They typically collect samples with a circulant matrix and accelerate the computation with the fast Fourier transform. Building on this, the ECO target tracking algorithm focuses on the problem of oversized models: it reduces parameters, simplifies the training set and lowers the model update frequency in order to speed up tracking and resist model drift. Such methods track well under slight occlusion, pose change and illumination change, but in the long-term tracking of a target they easily fail under heavy occlusion, rapid motion or out-of-view conditions.
Disclosure of Invention
The invention aims to provide a visual target tracking method and device based on an ECO algorithm and region growing segmentation, which solve the problem that, when the target is occluded or deformed, the ECO target tracking algorithm collects incorrect samples and the training effect suffers.
The invention is realized in the following way:
in one aspect, the invention provides a visual target tracking method based on an ECO algorithm and region growing segmentation, comprising the following steps:
s1, reading a first frame image of a video or image sequence, and determining a tracking target;
s2, taking the tracking target of the current frame as a sample, and feeding the sample to the ECO algorithm to train a correlation filter;
s3, reading the next frame of image, and obtaining a confidence map of the image candidate region by convolving the trained correlation filter with the corresponding features of the candidate region, the position of maximum probability in the confidence map giving the target position in the current frame image and its confidence probability;
s4, judging whether the target in the image is occluded or deformed;
s5, if occlusion or deformation occurs, segmenting the candidate region and taking the segmented target as a sample in place of the sample generated by the ECO algorithm; if no occlusion or deformation occurs, continuing to use the sample generated by the ECO algorithm;
s6, updating the sample set of the ECO algorithm with a sample fusion update strategy;
s7, training the ECO algorithm with the acquired sample set, and updating the correlation filter;
and S8, executing steps S3-S7 in a loop until the current video or image sequence is processed.
Further, the step S2 specifically includes:
First, multi-scale features x, including HOG features and color name (CN) features, are extracted from the candidate region z where the tracking target is located; PCA dimension reduction is applied to the extracted features, which are then transformed from the time domain to the frequency domain by the Fourier transform:

$$\hat{X}(k,l)=\frac{1}{NM}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}X(n,m)\,e^{-i2\pi\left(\frac{nk}{N}+\frac{ml}{M}\right)}$$

where X(n,m) is the dimension-reduced discrete feature and N, M are the numbers of rows and columns of the two-dimensional discrete feature.

The Fourier-transformed feature values are then interpolated into the continuous domain:

$$J_d\{x^d\}(t)=\sum_{n=0}^{N_d-1}x^d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right)$$

where x^d is the layer-d feature, N_d is the number of rows (samples) of each feature layer, b_d is the pre-computed interpolation weight, and T is the right endpoint of the continuous interval. Each feature map is then convolved with the correlation filter:

$$S_f\{x\}=\sum_{d=1}^{D}f^d*J_d\{x^d\}$$

where f^d is the correlation filter corresponding to the layer-d feature channel, J_d{x^d} is the continuous-domain feature map of the layer-d feature channel, x denotes all feature channels, and S_f{x} is the confidence score of sample x.

The correlation filter is solved by minimizing the following loss function:

$$E(f)=\sum_{j=1}^{m}\alpha_j\left\|S_f\{x_j\}-y_j\right\|^2+\sum_{d=1}^{D}\left\|w\,f^d\right\|^2$$

where y_j is the label, generated from the current training sample and obeying a Gaussian distribution, α_j is the weight corresponding to training sample x_j, and w is the regularization factor.
Further, the step S4 specifically includes:
The target region identified in the current frame is converted into a binary image and divided into several small blocks; the most recently judged non-occluded target is divided into equal blocks, corresponding blocks are subtracted, and if the difference of every small block exceeds a set threshold, occlusion or deformation is considered to have occurred.
Further, in step S5 the candidate region is segmented using a region growing segmentation algorithm, which proceeds as follows:
(1) Randomly select a seed pixel (x0, y0) on the four sides of the target region identified by the ECO algorithm;
(2) Taking (x0, y0) as the center, examine its 4-neighborhood or 8-neighborhood pixels (x, y); if the absolute difference between the gray value of (x, y) and that of the seed pixel is smaller than a threshold T, merge (x, y) into the same region as (x0, y0) and push (x, y) onto a stack;
(3) Pop a pixel from the stack and return to step (2), treating it as (x0, y0);
(4) When the stack is empty, return to step (1);
(5) Repeat steps (1)-(4) until every point in the image has been assigned to a region, at which point the growing ends.
Further, the sample fusion update strategy used in step S6 is specifically:
A maximum number of samples in the sample set is set; then:
(1) If the sample set is not full, the new sample is placed in an empty slot;
(2) If the sample set is full and contains a sample whose weight is below a threshold, the new sample replaces that old sample;
(3) If the sample set is full and the minimum distance between the new sample and the existing samples is smaller than the minimum distance between any two existing samples, the new sample is fused with its nearest existing sample;
(4) If the sample set is full and the minimum distance between the new sample and the existing samples is larger than the minimum distance between any two existing samples, the two closest existing samples are fused with each other and the new sample is placed in the slot freed by the fusion.
On the other hand, the invention also provides a visual target tracking device based on the ECO algorithm and the region growing segmentation, which comprises the following components:
the tracking target determining module is used for reading a first frame image of the video or image sequence and determining a tracking target;
the filter training module is used for taking the tracking target of the current frame as a sample and feeding the sample to the ECO algorithm to train a correlation filter;
the target position determining module is used for reading the next frame of image and obtaining a confidence map of the image candidate region by convolving the trained correlation filter with the corresponding features of the candidate region, the position of maximum probability in the confidence map giving the target position in the current frame and its confidence probability;
the occlusion/deformation judging module is used for judging whether the target in the image is occluded or deformed;
the sample determining module is used for segmenting the candidate region if occlusion or deformation occurs and taking the segmented target as a sample in place of the sample generated by the ECO algorithm, and for continuing to use the sample generated by the ECO algorithm if no occlusion or deformation occurs;
the sample set updating module is used for updating the sample set of the ECO algorithm with a sample fusion update strategy;
the filter updating module is used for training the ECO algorithm with the acquired sample set and updating the correlation filter;
the circulation control module is used for controlling the target position determining module, the occlusion/deformation judging module, the sample determining module, the sample set updating module and the filter updating module to execute in a loop until the current video or image sequence is processed.
Further, the filter training module is specifically configured to:
First, multi-scale features x, including HOG features and color name (CN) features, are extracted from the candidate region z where the tracking target is located; PCA dimension reduction is applied to the extracted features, which are then transformed from the time domain to the frequency domain by the Fourier transform:

$$\hat{X}(k,l)=\frac{1}{NM}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}X(n,m)\,e^{-i2\pi\left(\frac{nk}{N}+\frac{ml}{M}\right)}$$

where X(n,m) is the dimension-reduced discrete feature and N, M are the numbers of rows and columns of the two-dimensional discrete feature.

The Fourier-transformed feature values are then interpolated into the continuous domain:

$$J_d\{x^d\}(t)=\sum_{n=0}^{N_d-1}x^d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right)$$

where x^d is the layer-d feature, N_d is the number of rows (samples) of each feature layer, b_d is the pre-computed interpolation weight, and T is the right endpoint of the continuous interval. Each feature map is then convolved with the correlation filter:

$$S_f\{x\}=\sum_{d=1}^{D}f^d*J_d\{x^d\}$$

where f^d is the correlation filter corresponding to the layer-d feature channel, J_d{x^d} is the continuous-domain feature map of the layer-d feature channel, x denotes all feature channels, and S_f{x} is the confidence score of sample x.

The correlation filter is solved by minimizing the following loss function:

$$E(f)=\sum_{j=1}^{m}\alpha_j\left\|S_f\{x_j\}-y_j\right\|^2+\sum_{d=1}^{D}\left\|w\,f^d\right\|^2$$

where y_j is the label, generated from the current training sample and obeying a Gaussian distribution, α_j is the weight corresponding to training sample x_j, and w is the regularization factor.
Further, the occlusion/deformation judging module is specifically configured to:
convert the target region identified in the current frame into a binary image and divide it into several small blocks, divide the most recently judged non-occluded target into equal blocks and subtract the corresponding blocks; if the difference of every small block exceeds a set threshold, occlusion or deformation is considered to have occurred.
Further, the sample determining module segments the candidate region using a region growing segmentation algorithm, which proceeds as follows:
(1) Randomly select a seed pixel (x0, y0) on the four sides of the target region identified by the ECO algorithm;
(2) Taking (x0, y0) as the center, examine its 4-neighborhood or 8-neighborhood pixels (x, y); if the absolute difference between the gray value of (x, y) and that of the seed pixel is smaller than a threshold T, merge (x, y) into the same region as (x0, y0) and push (x, y) onto a stack;
(3) Pop a pixel from the stack and return to step (2), treating it as (x0, y0);
(4) When the stack is empty, return to step (1);
(5) Repeat steps (1)-(4) until every point in the image has been assigned to a region, at which point the growing ends.
Further, the sample fusion update strategy used by the sample set updating module is specifically:
A maximum number of samples in the sample set is set; then:
(1) If the sample set is not full, the new sample is placed in an empty slot;
(2) If the sample set is full and contains a sample whose weight is below a threshold, the new sample replaces that old sample;
(3) If the sample set is full and the minimum distance between the new sample and the existing samples is smaller than the minimum distance between any two existing samples, the new sample is fused with its nearest existing sample;
(4) If the sample set is full and the minimum distance between the new sample and the existing samples is larger than the minimum distance between any two existing samples, the two closest existing samples are fused with each other and the new sample is placed in the slot freed by the fusion.
Compared with the prior art, the invention has the following beneficial effects:
according to the visual target tracking method and device based on the ECO algorithm and the region growing segmentation, the ECO target tracking algorithm is adopted by the algorithm main body, the region growing segmentation algorithm is added at the same time, the situation that when the target is shielded or deformed, the target in an ECO algorithm frame is easy to deviate, the target tracking is affected when the target is not shielded or deformed, and then when the target is shielded or deformed, a target sample with a more complete shape contour is obtained by combining local region growing segmentation, the sample set of the ECO algorithm is updated, and the training effect is improved.
Drawings
FIG. 1 is a flowchart of a visual target tracking method based on an ECO algorithm and region growing segmentation provided by an embodiment of the invention;
fig. 2 is a block diagram of a visual target tracking device based on an ECO algorithm and region growing segmentation according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, an embodiment of the present invention provides a visual target tracking method based on an ECO algorithm and region growing segmentation, including the following steps:
s1, reading the first frame image of a video or image sequence, and determining the tracking target, either manually or through a target detection algorithm.
S2, taking the tracking target of the current frame as a sample, and feeding the sample to the ECO algorithm to train a correlation filter; specifically:
First, multi-scale features x, including HOG features and color name (CN) features, are extracted from the candidate region z where the tracking target is located; PCA dimension reduction is applied to the extracted features, which are then transformed from the time domain to the frequency domain by the Fourier transform:

$$\hat{X}(k,l)=\frac{1}{NM}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}X(n,m)\,e^{-i2\pi\left(\frac{nk}{N}+\frac{ml}{M}\right)}$$

where X(n,m) is the dimension-reduced discrete feature and N, M are the numbers of rows and columns of the two-dimensional discrete feature.

The Fourier-transformed feature values are then interpolated into the continuous domain:

$$J_d\{x^d\}(t)=\sum_{n=0}^{N_d-1}x^d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right)$$

where x^d is the layer-d feature, N_d is the number of rows (samples) of each feature layer, b_d is the pre-computed interpolation weight, and T is the right endpoint of the continuous interval. Each feature map is then convolved with the correlation filter:

$$S_f\{x\}=\sum_{d=1}^{D}f^d*J_d\{x^d\}$$

where f^d is the correlation filter corresponding to the layer-d feature channel, J_d{x^d} is the continuous-domain feature map of the layer-d feature channel, x denotes all feature channels, and S_f{x} is the confidence score of sample x.

The correlation filter is solved by minimizing the following loss function:

$$E(f)=\sum_{j=1}^{m}\alpha_j\left\|S_f\{x_j\}-y_j\right\|^2+\sum_{d=1}^{D}\left\|w\,f^d\right\|^2$$

where y_j is the label, generated from the current training sample and obeying a Gaussian distribution, α_j is the weight corresponding to training sample x_j, and w is the regularization factor.
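The complete ECO solver (factorized convolution with conjugate-gradient optimization over the continuous domain) is more involved than can be shown here; the following Python sketch conveys only the Fourier-domain ridge-regression idea behind the loss above, reduced to a single feature channel. All helper names and the regularization value are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def train_filter(samples, labels, weights, reg=1e-2):
    # samples: list of 2-D feature patches x_j (all the same shape)
    # labels:  list of Gaussian label maps y_j
    # weights: per-sample weights alpha_j
    # reg:     scalar regularization standing in for the factor w
    num = np.zeros(samples[0].shape, dtype=complex)
    den = np.zeros(samples[0].shape, dtype=complex)
    for x, y, a in zip(samples, labels, weights):
        X = np.fft.fft2(x)          # sample in the frequency domain
        Y = np.fft.fft2(y)          # label in the frequency domain
        num += a * np.conj(X) * Y   # weighted cross-correlation term
        den += a * np.conj(X) * X   # weighted spectral energy term
    return num / (den + reg)        # per-frequency ridge-regression solution
```

Each frequency bin is solved independently, which is what makes correlation-filter training fast enough for real-time tracking.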
S3, reading the next frame of image, and obtaining a confidence map of the image candidate region by convolving the trained correlation filter with the corresponding features of the candidate region; the position of maximum probability in the confidence map gives the target position in the current frame image and its confidence probability.
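Continuing the single-channel sketch above (again a simplified stand-in, not the patent's multi-channel continuous-domain implementation), localization amounts to a frequency-domain product followed by a peak search over the confidence map:

```python
import numpy as np

def locate_target(filter_hat, patch):
    # filter_hat: frequency-domain filter from train_filter
    # patch:      feature patch of the candidate region in the next frame
    response = np.real(np.fft.ifft2(filter_hat * np.fft.fft2(patch)))  # confidence map
    peak = np.unravel_index(np.argmax(response), response.shape)
    return peak, response[peak]   # target position and its confidence probability
```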
S4, judging whether the target in the image is occluded or deformed; specifically:
The target region identified in the current frame is converted into a binary image and divided into 20 small blocks; the most recently judged non-occluded target is divided into equal blocks, corresponding blocks are subtracted, and if the difference of every small block exceeds a set threshold, occlusion or deformation is considered to have occurred. In other embodiments the number of blocks is not limited to 20 and can be chosen reasonably according to the actual situation; a minimal code sketch of this test follows.
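The sketch below illustrates the block-difference test; the 4x5 grid (20 blocks), the mean-based binarization and the threshold value are illustrative assumptions rather than values fixed by the embodiment:

```python
import numpy as np

def occluded_or_deformed(curr_region, last_clear_region, grid=(4, 5), thresh=0.25):
    # Both inputs are grayscale target regions resized to the same shape.
    def binarize(img):
        return (img > img.mean()).astype(np.float32)   # simple global threshold
    a, b = binarize(curr_region), binarize(last_clear_region)
    bh, bw = a.shape[0] // grid[0], a.shape[1] // grid[1]
    for i in range(grid[0]):
        for j in range(grid[1]):
            diff = np.abs(a[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
                          - b[i*bh:(i+1)*bh, j*bw:(j+1)*bw]).mean()
            if diff <= thresh:
                return False        # some block still matches the clear target
    return True                     # every block differs beyond the threshold
```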
S5, if occlusion or deformation occurs, segmenting the candidate region and taking the segmented target as a sample in place of the sample generated by the ECO algorithm; if no occlusion or deformation occurs, continuing to use the sample generated by the ECO algorithm. The candidate region is preferably segmented with a region growing segmentation algorithm, which proceeds as follows (a code sketch is given after the steps):
(1) Randomly select a seed pixel (x0, y0) on the four sides of the target region identified by the ECO algorithm;
(2) Taking (x0, y0) as the center, examine its 4-neighborhood or 8-neighborhood pixels (x, y); if the absolute difference between the gray value of (x, y) and that of the seed pixel is smaller than a threshold T, merge (x, y) into the same region as (x0, y0) and push (x, y) onto a stack;
(3) Pop a pixel from the stack and return to step (2), treating it as (x0, y0);
(4) When the stack is empty, return to step (1);
(5) Repeat steps (1)-(4) until every point in the image has been assigned to a region, at which point the growing ends.
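A minimal Python sketch of steps (1)-(5) on a grayscale candidate region follows; the threshold value and the data layout are illustrative assumptions:

```python
import numpy as np

def region_grow(gray, seeds, thresh=10, connectivity=8):
    # gray:  2-D uint8 candidate region
    # seeds: (row, col) pixels chosen on the four sides of the ECO target box
    h, w = gray.shape
    labels = np.zeros((h, w), dtype=np.int32)          # 0 means unassigned
    offs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        offs += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    region = 0
    for seed in seeds:                                 # step (1): pick a seed
        if labels[seed]:
            continue                                   # already claimed by a region
        region += 1
        labels[seed] = region
        stack = [seed]
        while stack:                                   # steps (3)-(4): grow until the stack empties
            cy, cx = stack.pop()
            for dy, dx in offs:                        # step (2): scan the neighborhood
                y, x = cy + dy, cx + dx
                if (0 <= y < h and 0 <= x < w and not labels[y, x]
                        and abs(int(gray[y, x]) - int(gray[cy, cx])) < thresh):
                    labels[y, x] = region              # merge into the current region
                    stack.append((y, x))
    return labels                                      # step (5): all reachable points assigned
```

Growing from the box borders inward labels the background regions, so the unlabeled remainder outlines the target contour used as the replacement sample.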
S6, updating the sample set of the ECO algorithm with a sample fusion update strategy; the sample fusion update strategy is specifically as follows (a code sketch is given after the cases):
A maximum number of samples in the sample set is set; then:
(1) If the sample set is not full, the new sample is placed in an empty slot;
(2) If the sample set is full and contains a sample whose weight is below a threshold, the new sample replaces that old sample;
(3) If the sample set is full and the minimum distance between the new sample and the existing samples is smaller than the minimum distance between any two existing samples, the new sample is fused with its nearest existing sample;
(4) If the sample set is full and the minimum distance between the new sample and the existing samples is larger than the minimum distance between any two existing samples, the two closest existing samples are fused with each other and the new sample is placed in the slot freed by the fusion.
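A minimal Python sketch of the four cases follows; the Euclidean distance, the initial sample weight of 1 and the weight threshold are illustrative assumptions (ECO itself maintains a Gaussian-mixture model over samples):

```python
import numpy as np

def update_sample_set(samples, weights, new_sample, max_size, weight_floor=1e-3):
    dist = lambda a, b: np.linalg.norm(a - b)          # illustrative sample distance
    if len(samples) < max_size:                        # case (1): free slot available
        samples.append(new_sample); weights.append(1.0)
        return
    low = int(np.argmin(weights))
    if weights[low] < weight_floor:                    # case (2): replace a stale sample
        samples[low], weights[low] = new_sample, 1.0
        return
    d_new = [dist(new_sample, s) for s in samples]
    d_pair, i, j = min((dist(samples[i], samples[j]), i, j)
                       for i in range(len(samples))
                       for j in range(i + 1, len(samples)))
    if min(d_new) < d_pair:                            # case (3): fuse new with its nearest sample
        k = int(np.argmin(d_new))
        w = weights[k]
        samples[k] = (w * samples[k] + new_sample) / (w + 1.0)
        weights[k] = w + 1.0
    else:                                              # case (4): fuse the two closest samples,
        wi, wj = weights[i], weights[j]                # then put the new sample in the freed slot
        samples[i] = (wi * samples[i] + wj * samples[j]) / (wi + wj)
        weights[i] = wi + wj
        samples[j], weights[j] = new_sample, 1.0
```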
S7, training an ECO algorithm by using the acquired sample set, and updating the correlation filter.
And S8, circularly executing the steps S3-S7 until the current video or image sequence is processed.
Based on the same inventive concept, an embodiment of the invention also provides a visual target tracking device based on the ECO algorithm and region growing segmentation. Since the principle by which the device solves the technical problem is similar to that of the method, its implementation can refer to the implementation of the method, and repeated details are omitted.
As shown in fig. 2, a visual target tracking device based on an ECO algorithm and region growing segmentation according to an embodiment of the present invention includes:
the tracking target determining module is used for reading a first frame image of the video or image sequence and determining a tracking target;
the filter training module is used for taking the tracking target of the current frame as a sample and feeding the sample to the ECO algorithm to train a correlation filter;
the target position determining module is used for reading the next frame of image and obtaining a confidence map of the image candidate region by convolving the trained correlation filter with the corresponding features of the candidate region, the position of maximum probability in the confidence map giving the target position in the current frame and its confidence probability;
the occlusion/deformation judging module is used for judging whether the target in the image is occluded or deformed;
the sample determining module is used for segmenting the candidate region if occlusion or deformation occurs and taking the segmented target as a sample in place of the sample generated by the ECO algorithm, and for continuing to use the sample generated by the ECO algorithm if no occlusion or deformation occurs;
the sample set updating module is used for updating the sample set of the ECO algorithm with a sample fusion update strategy;
the filter updating module is used for training the ECO algorithm with the acquired sample set and updating the correlation filter;
the circulation control module is used for controlling the target position determining module, the occlusion/deformation judging module, the sample determining module, the sample set updating module and the filter updating module to execute in a loop until the current video or image sequence is processed.
In one embodiment, the filter training module is specifically configured to:
First, multi-scale features x, including HOG features and color name (CN) features, are extracted from the candidate region z where the tracking target is located; PCA dimension reduction is applied to the extracted features, which are then transformed from the time domain to the frequency domain by the Fourier transform:

$$\hat{X}(k,l)=\frac{1}{NM}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}X(n,m)\,e^{-i2\pi\left(\frac{nk}{N}+\frac{ml}{M}\right)}$$

where X(n,m) is the dimension-reduced discrete feature and N, M are the numbers of rows and columns of the two-dimensional discrete feature.

The Fourier-transformed feature values are then interpolated into the continuous domain:

$$J_d\{x^d\}(t)=\sum_{n=0}^{N_d-1}x^d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right)$$

where x^d is the layer-d feature, N_d is the number of rows (samples) of each feature layer, b_d is the pre-computed interpolation weight, and T is the right endpoint of the continuous interval. Each feature map is then convolved with the correlation filter:

$$S_f\{x\}=\sum_{d=1}^{D}f^d*J_d\{x^d\}$$

where f^d is the correlation filter corresponding to the layer-d feature channel, J_d{x^d} is the continuous-domain feature map of the layer-d feature channel, x denotes all feature channels, and S_f{x} is the confidence score of sample x.

The correlation filter is solved by minimizing the following loss function:

$$E(f)=\sum_{j=1}^{m}\alpha_j\left\|S_f\{x_j\}-y_j\right\|^2+\sum_{d=1}^{D}\left\|w\,f^d\right\|^2$$

where y_j is the label, generated from the current training sample and obeying a Gaussian distribution, α_j is the weight corresponding to training sample x_j, and w is the regularization factor.
In one embodiment, the occlusion/deformation judging module is specifically configured to:
convert the target region identified in the current frame into a binary image and divide it into several small blocks, divide the most recently judged non-occluded target into equal blocks and subtract the corresponding blocks; if the difference of every small block exceeds a set threshold, occlusion or deformation is considered to have occurred.
In one embodiment, the sample determining module segments the candidate region using a region growing segmentation algorithm, which proceeds as follows:
(1) Randomly select a seed pixel (x0, y0) on the four sides of the target region identified by the ECO algorithm;
(2) Taking (x0, y0) as the center, examine its 4-neighborhood or 8-neighborhood pixels (x, y); if the absolute difference between the gray value of (x, y) and that of the seed pixel is smaller than a threshold T, merge (x, y) into the same region as (x0, y0) and push (x, y) onto a stack;
(3) Pop a pixel from the stack and return to step (2), treating it as (x0, y0);
(4) When the stack is empty, return to step (1);
(5) Repeat steps (1)-(4) until every point in the image has been assigned to a region, at which point the growing ends.
In one embodiment, the sample fusion update strategy used by the sample set updating module is specifically:
A maximum number of samples in the sample set is set; then:
(1) If the sample set is not full, the new sample is placed in an empty slot;
(2) If the sample set is full and contains a sample whose weight is below a threshold, the new sample replaces that old sample;
(3) If the sample set is full and the minimum distance between the new sample and the existing samples is smaller than the minimum distance between any two existing samples, the new sample is fused with its nearest existing sample;
(4) If the sample set is full and the minimum distance between the new sample and the existing samples is larger than the minimum distance between any two existing samples, the two closest existing samples are fused with each other and the new sample is placed in the slot freed by the fusion.
In summary, in the visual target tracking method and device based on the ECO algorithm and region growing segmentation provided by the embodiments of the invention, the main body of the algorithm is the ECO target tracking algorithm, with a region growing segmentation algorithm added. It is observed that when the target is occluded or deformed, the target inside the ECO bounding box tends to drift, which degrades tracking even after the occlusion or deformation ends; therefore, when the target is occluded or deformed, local region growing segmentation is combined to obtain a target sample with a more complete shape contour, the sample set of the ECO algorithm is updated accordingly, and the training effect is improved.
Those of ordinary skill in the art will appreciate that all or part of the steps of the methods in the embodiments may be implemented by a program instructing associated hardware, and the program may be stored on a computer-readable storage medium; the storage medium may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims (10)

1. A visual target tracking method based on an ECO algorithm and region growing segmentation, characterized by comprising the following steps:
s1, reading a first frame image of a video or image sequence, and determining a tracking target;
s2, taking the tracking target of the current frame as a sample, and feeding the sample to the ECO algorithm to train a correlation filter;
s3, reading the next frame of image, and obtaining a confidence map of the image candidate region by convolving the trained correlation filter with the corresponding features of the candidate region, the position of maximum probability in the confidence map giving the target position in the current frame image and its confidence probability;
s4, judging whether the target in the image is occluded or deformed;
s5, if occlusion or deformation occurs, segmenting the candidate region based on a region growing segmentation algorithm and taking the segmented target as a sample in place of the sample generated by the ECO algorithm; if no occlusion or deformation occurs, continuing to use the sample generated by the ECO algorithm;
s6, updating the sample set of the ECO algorithm with a sample fusion update strategy;
s7, training the ECO algorithm with the acquired sample set, and updating the correlation filter;
s8, executing steps S3-S7 in a loop until the current video or image sequence is processed.
2. The visual target tracking method based on the ECO algorithm and region growing segmentation as defined in claim 1, wherein said step S2 specifically comprises:
First, multi-scale features x, including HOG features and color name (CN) features, are extracted from the candidate region z where the tracking target is located; PCA dimension reduction is applied to the extracted features, which are then transformed from the time domain to the frequency domain by the Fourier transform:

$$\hat{X}(k,l)=\frac{1}{NM}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}X(n,m)\,e^{-i2\pi\left(\frac{nk}{N}+\frac{ml}{M}\right)}$$

where X(n,m) is the dimension-reduced discrete feature and N, M are the numbers of rows and columns of the two-dimensional discrete feature.

The Fourier-transformed feature values are then interpolated into the continuous domain:

$$J_d\{x^d\}(t)=\sum_{n=0}^{N_d-1}x^d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right)$$

where x^d is the layer-d feature, N_d is the number of rows (samples) of each feature layer, b_d is the pre-computed interpolation weight, and T is the right endpoint of the continuous interval. Each feature map is then convolved with the correlation filter:

$$S_f\{x\}=\sum_{d=1}^{D}f^d*J_d\{x^d\}$$

where f^d is the correlation filter corresponding to the layer-d feature channel, J_d{x^d} is the continuous-domain feature map of the layer-d feature channel, x denotes all feature channels, and S_f{x} is the confidence score of sample x.

The correlation filter is solved by minimizing the following loss function:

$$E(f)=\sum_{j=1}^{m}\alpha_j\left\|S_f\{x_j\}-y_j\right\|^2+\sum_{d=1}^{D}\left\|w\,f^d\right\|^2$$

where y_j is the label, generated from the current training sample and obeying a Gaussian distribution, α_j is the weight corresponding to training sample x_j, and w is the regularization factor.
3. The visual target tracking method based on the ECO algorithm and region growing segmentation as defined in claim 1, wherein said step S4 specifically comprises:
converting the target region identified in the current frame into a binary image and dividing it into several small blocks, dividing the most recently judged non-occluded target into equal blocks and subtracting the corresponding blocks; if the difference of every small block exceeds a set threshold, occlusion or deformation is considered to have occurred.
4. The visual target tracking method based on the ECO algorithm and region growing segmentation according to claim 1, wherein in step S5 the candidate region is segmented using a region growing segmentation algorithm, which comprises the following steps:
(1) Randomly select a seed pixel (x0, y0) on the four sides of the target region identified by the ECO algorithm;
(2) Taking (x0, y0) as the center, examine its 4-neighborhood or 8-neighborhood pixels (x, y); if the absolute difference between the gray value of (x, y) and that of the seed pixel is smaller than a threshold T, merge (x, y) into the same region as (x0, y0) and push (x, y) onto a stack;
(3) Pop a pixel from the stack and return to step (2), treating it as (x0, y0);
(4) When the stack is empty, return to step (1);
(5) Repeat steps (1)-(4) until every point in the image has been assigned to a region, at which point the growing ends.
5. The visual target tracking method based on the ECO algorithm and region growing segmentation as set forth in claim 1, wherein the sample fusion update strategy used in step S6 is specifically:
A maximum number of samples in the sample set is set; then:
(1) If the sample set is not full, the new sample is placed in an empty slot;
(2) If the sample set is full and contains a sample whose weight is below a threshold, the new sample replaces that old sample;
(3) If the sample set is full and the minimum distance between the new sample and the existing samples is smaller than the minimum distance between any two existing samples, the new sample is fused with its nearest existing sample;
(4) If the sample set is full and the minimum distance between the new sample and the existing samples is larger than the minimum distance between any two existing samples, the two closest existing samples are fused with each other and the new sample is placed in the slot freed by the fusion.
6. A visual target tracking apparatus based on an ECO algorithm and region growing segmentation, comprising:
the tracking target determining module is used for reading a first frame image of the video or image sequence and determining a tracking target;
the filter training module is used for taking the tracking target of the current frame as a sample and feeding the sample to the ECO algorithm to train a correlation filter;
the target position determining module is used for reading the next frame of image and obtaining a confidence map of the image candidate region by convolving the trained correlation filter with the corresponding features of the candidate region, the position of maximum probability in the confidence map giving the target position in the current frame and its confidence probability;
the occlusion/deformation judging module is used for judging whether the target in the image is occluded or deformed;
the sample determining module is used for segmenting the candidate region based on a region growing segmentation algorithm if occlusion or deformation occurs and taking the segmented target as a sample in place of the sample generated by the ECO algorithm, and for continuing to use the sample generated by the ECO algorithm if no occlusion or deformation occurs;
the sample set updating module is used for updating the sample set of the ECO algorithm with a sample fusion update strategy;
the filter updating module is used for training the ECO algorithm with the acquired sample set and updating the correlation filter;
the circulation control module is used for controlling the target position determining module, the occlusion/deformation judging module, the sample determining module, the sample set updating module and the filter updating module to execute in a loop until the current video or image sequence is processed.
7. The ECO algorithm and region growing segmentation based visual target tracking device of claim 6, wherein said filter training module is specifically configured to:
First, multi-scale features x, including HOG features and color name (CN) features, are extracted from the candidate region z where the tracking target is located; PCA dimension reduction is applied to the extracted features, which are then transformed from the time domain to the frequency domain by the Fourier transform:

$$\hat{X}(k,l)=\frac{1}{NM}\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}X(n,m)\,e^{-i2\pi\left(\frac{nk}{N}+\frac{ml}{M}\right)}$$

where X(n,m) is the dimension-reduced discrete feature and N, M are the numbers of rows and columns of the two-dimensional discrete feature.

The Fourier-transformed feature values are then interpolated into the continuous domain:

$$J_d\{x^d\}(t)=\sum_{n=0}^{N_d-1}x^d[n]\,b_d\!\left(t-\frac{T}{N_d}n\right)$$

where x^d is the layer-d feature, N_d is the number of rows (samples) of each feature layer, b_d is the pre-computed interpolation weight, and T is the right endpoint of the continuous interval. Each feature map is then convolved with the correlation filter:

$$S_f\{x\}=\sum_{d=1}^{D}f^d*J_d\{x^d\}$$

where f^d is the correlation filter corresponding to the layer-d feature channel, J_d{x^d} is the continuous-domain feature map of the layer-d feature channel, x denotes all feature channels, and S_f{x} is the confidence score of sample x.

The correlation filter is solved by minimizing the following loss function:

$$E(f)=\sum_{j=1}^{m}\alpha_j\left\|S_f\{x_j\}-y_j\right\|^2+\sum_{d=1}^{D}\left\|w\,f^d\right\|^2$$

where y_j is the label, generated from the current training sample and obeying a Gaussian distribution, α_j is the weight corresponding to training sample x_j, and w is the regularization factor.
8. The ECO algorithm and region growing segmentation based visual target tracking device of claim 6, wherein said occlusion/deformation judging module is specifically configured to:
convert the target region identified in the current frame into a binary image and divide it into several small blocks, divide the most recently judged non-occluded target into equal blocks and subtract the corresponding blocks; if the difference of every small block exceeds a set threshold, occlusion or deformation is considered to have occurred.
9. The ECO algorithm and region growing segmentation based visual target tracking device of claim 6, wherein the sample determining module segments the candidate region using a region growing segmentation algorithm, which proceeds as follows:
(1) Randomly select a seed pixel (x0, y0) on the four sides of the target region identified by the ECO algorithm;
(2) Taking (x0, y0) as the center, examine its 4-neighborhood or 8-neighborhood pixels (x, y); if the absolute difference between the gray value of (x, y) and that of the seed pixel is smaller than a threshold T, merge (x, y) into the same region as (x0, y0) and push (x, y) onto a stack;
(3) Pop a pixel from the stack and return to step (2), treating it as (x0, y0);
(4) When the stack is empty, return to step (1);
(5) Repeat steps (1)-(4) until every point in the image has been assigned to a region, at which point the growing ends.
10. The ECO algorithm and region growing segmentation based visual target tracking device of claim 6, wherein the sample fusion update strategy used by the sample set updating module is specifically:
A maximum number of samples in the sample set is set; then:
(1) If the sample set is not full, the new sample is placed in an empty slot;
(2) If the sample set is full and contains a sample whose weight is below a threshold, the new sample replaces that old sample;
(3) If the sample set is full and the minimum distance between the new sample and the existing samples is smaller than the minimum distance between any two existing samples, the new sample is fused with its nearest existing sample;
(4) If the sample set is full and the minimum distance between the new sample and the existing samples is larger than the minimum distance between any two existing samples, the two closest existing samples are fused with each other and the new sample is placed in the slot freed by the fusion.
CN202010410576.0A 2020-05-15 2020-05-15 Visual target tracking method and device based on ECO algorithm and region growing segmentation Active CN111598925B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010410576.0A CN111598925B (en) 2020-05-15 2020-05-15 Visual target tracking method and device based on ECO algorithm and region growing segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010410576.0A CN111598925B (en) 2020-05-15 2020-05-15 Visual target tracking method and device based on ECO algorithm and region growing segmentation

Publications (2)

Publication Number Publication Date
CN111598925A CN111598925A (en) 2020-08-28
CN111598925B true CN111598925B (en) 2023-10-17

Family

ID=72187164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010410576.0A Active CN111598925B (en) 2020-05-15 2020-05-15 Visual target tracking method and device based on ECO algorithm and region growing segmentation

Country Status (1)

Country Link
CN (1) CN111598925B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111986236A (en) * 2020-09-01 2020-11-24 安徽炬视科技有限公司 Anti-occlusion target tracking algorithm based on online learning
CN112330719B (en) * 2020-12-02 2024-02-27 东北大学 Deep learning target tracking method based on feature map segmentation and self-adaptive fusion
CN113030943B (en) * 2021-03-05 2023-08-18 中国人民解放军空军工程大学航空机务士官学校 Multi-target tracking algorithm based on monopulse radar signal acquisition azimuth range profile
CN113066108A (en) * 2021-04-14 2021-07-02 武汉卓目科技有限公司 Anti-occlusion visual target tracking method and device based on ECO algorithm
CN113223054B (en) * 2021-05-28 2023-03-24 武汉卓目科技有限公司 Target tracking method and device for improving jitter property of ECO (equal cost offset) tracking frame
CN113313738B (en) * 2021-07-15 2021-10-01 武汉卓目科技有限公司 Unmanned aerial vehicle target tracking method and device based on ECO and servo linkage

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596951A (en) * 2018-03-30 2018-09-28 西安电子科技大学 A kind of method for tracking target of fusion feature
CN109887001A (en) * 2019-01-31 2019-06-14 中国科学院深圳先进技术研究院 Method for tracking target, device, computer equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109448022B (en) * 2018-10-16 2021-08-06 天津津航技术物理研究所 Infrared anti-occlusion tracking method for guidance system
AU2020100044A4 (en) * 2020-01-10 2020-02-13 Lin, Xichan Miss Method of tracking of Surgical Target and Tool

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596951A (en) * 2018-03-30 2018-09-28 西安电子科技大学 A kind of method for tracking target of fusion feature
CN109887001A (en) * 2019-01-31 2019-06-14 中国科学院深圳先进技术研究院 Method for tracking target, device, computer equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Martin Danelljan et al. Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking. Computer Vision - ECCV 2016, 2016, pp. 472-488. *
Martin Danelljan et al. ECO: Efficient Convolution Operators for Tracking. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 6931-6939. *
杨露菁 et al. Intelligent Image Processing and Applications (智能图像处理及应用). China Railway Publishing House Co., Ltd., 2019, p. 178. *
赵涓涓. Early Computer-Aided Diagnosis of Lung Cancer Based on PET-CT (基于PET-CT的肺癌早期计算机辅助诊断技术). National Defense Industry Press, 2015, pp. 24-26. *
郭洪伟. Video Target Tracking Based on ECO (基于ECO的视频目标跟踪). China Doctoral and Master's Theses Full-text Database (Master), Information Science and Technology, 2020, No. 2, pp. 20-21. *

Also Published As

Publication number Publication date
CN111598925A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN111598925B (en) Visual target tracking method and device based on ECO algorithm and region growing segmentation
CN108710865B (en) Driver abnormal behavior detection method based on neural network
CN111666921B (en) Vehicle control method, apparatus, computer device, and computer-readable storage medium
CN108053419B (en) Multi-scale target tracking method based on background suppression and foreground anti-interference
CN111445478B (en) Automatic intracranial aneurysm region detection system and detection method for CTA image
CN107633226B (en) Human body motion tracking feature processing method
CN109658442B (en) Multi-target tracking method, device, equipment and computer readable storage medium
CN112036231B (en) Vehicle-mounted video-based lane line and pavement indication mark detection and identification method
CN110008900B (en) Method for extracting candidate target from visible light remote sensing image from region to target
CN111612817A (en) Target tracking method based on depth feature adaptive fusion and context information
CN111192294B (en) Target tracking method and system based on target detection
CN111415364B (en) Conversion method, system and storage medium for image segmentation sample in computer vision
CN112766170B (en) Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image
CN109448024B (en) Visual tracking method and system for constructing constraint correlation filter by using depth data
CN112700469A (en) Visual target tracking method and device based on ECO algorithm and target detection
CN112686247A (en) Identification card number detection method and device, readable storage medium and terminal
CN110751671B (en) Target tracking method based on kernel correlation filtering and motion estimation
CN110751163A (en) Target positioning method and device, computer readable storage medium and electronic equipment
CN113066108A (en) Anti-occlusion visual target tracking method and device based on ECO algorithm
CN111598926B (en) Target tracking method and device for optimizing ECO feature extraction performance
CN112861652A (en) Method and system for tracking and segmenting video target based on convolutional neural network
CN113223054B (en) Target tracking method and device for improving jitter property of ECO (equal cost offset) tracking frame
CN113313738B (en) Unmanned aerial vehicle target tracking method and device based on ECO and servo linkage
CN115984316B (en) Industrial image edge extraction method and device for complex environment
CN113192106B (en) Livestock tracking method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant