CN113298136B - Twin network tracking method based on alpha divergence - Google Patents


Info

Publication number
CN113298136B
Authority
CN
China
Prior art keywords
target
training
regression branch
alpha
twin network
Prior art date
Legal status
Active
Application number
CN202110556609.7A
Other languages
Chinese (zh)
Other versions
CN113298136A (en)
Inventor
胡旷伋
朱虎
邓丽珍
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202110556609.7A priority Critical patent/CN113298136B/en
Publication of CN113298136A publication Critical patent/CN113298136A/en
Application granted granted Critical
Publication of CN113298136B publication Critical patent/CN113298136B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a twin network tracking method based on alpha divergence, aiming to solve the technical problem that visual tracking with high robustness and accuracy is difficult to achieve in the prior art. The method comprises: acquiring an image to be tracked and a trained twin network, wherein the twin network is trained based on alpha divergence; extracting depth features of the image to be tracked with ResNet50; processing the depth features of the image to be tracked with a target center regression branch to obtain the predicted target position; and processing the depth features of the image to be tracked with a target frame regression branch to obtain the predicted target frame. From a probabilistic perspective, the method accounts for the noise and uncertainty introduced by manual annotation, and achieves higher accuracy and robustness.

Description

Twin network tracking method based on alpha divergence
Technical Field
The invention relates to a twin network (Siamese network) tracking method based on alpha divergence, and belongs to the technical field of computer vision.
Background
With the development of communications, computing and related fields, artificial intelligence has become a major research focus. Computer vision, which lets a camera serve as the "eye" of a computer, aims to enable the computer to process massive, high-dimensional image data much as the brain does, and is widely applied to industrial flaw detection, medical image processing, road safety, and surveillance. Target tracking is a fundamental and important direction in computer vision: given the initial state of a target, it estimates the target's motion trajectory in a video sequence. This seemingly simple task draws on image processing, pattern recognition, probability theory, optimization theory, and other fields, and has broad military and civilian applications such as sports broadcasting, autonomous vehicles, intelligent surveillance, and human-computer interaction systems.
In recent years, tracking algorithms based on discriminative correlation filters (DCF) have achieved excellent results, and tensor feature representations that preserve the structure of high-dimensional image data have been introduced into the DCF framework. With the continuous growth of computing power and the strong feature-extraction capability of deep neural networks, many researchers and technology companies now conduct computer-vision research based on deep learning, with great success. Driven by visual tracking competitions, a large number of manually annotated datasets such as OTB, TrackingNet and COCO have appeared, providing a data foundation for deep-learning-based target tracking and greatly stimulating its development. Researchers increasingly integrate traditional tracking algorithms into deep networks, relieving the burden of hand-crafted feature extraction and parameter tuning and improving the discrimination between target and background. However, deep learning is data-driven and suffers from long training times, high demands on training samples, and high hardware requirements, so it still has limitations.
Furthermore, a tracker suffers from two kinds of interference: rotation and deformation caused by the target's own movement, together with blurring and scale change caused by rapid motion; and occlusion and background clutter caused by the external environment. These disturbances pose many challenges to tracking algorithms, and achieving highly robust and accurate visual tracking remains difficult.
Target tracking is closely related to everyday life and has very broad application prospects. Although tracking technology is continuously updated to overcome various kinds of interference, designing a tracker that is both robust and real-time is still a difficult task, and research on it is of great significance.
Disclosure of Invention
In order to solve the problem that visual tracking with high robustness and accuracy is difficult to realize in the prior art, the invention provides a twin network tracking method based on alpha divergence.
In order to solve the technical problems, the invention adopts the following technical means:
the invention provides a twin network tracking method based on alpha divergence, which comprises the following steps:
acquiring an image to be tracked and a well-trained twin network, wherein the twin network is trained based on alpha divergence;
extracting the depth features of the image to be tracked by using ResNet50 in the trained twin network;
processing the depth characteristics of the image to be tracked by using a target center regression branch in the trained twin network to obtain the predicted target position of the image to be tracked;
and processing the depth characteristics of the image to be tracked by using the target frame regression branch in the trained twin network to obtain a predicted target frame of the image to be tracked.
Further, the training process of the twin network is as follows:
constructing a basic framework of a twin network, wherein the twin network comprises a main network adopting ResNet50, a target center regression branch and a target frame regression branch;
obtaining a training set and a test set of the twin network, wherein the training set or the test set comprises a plurality of training images or test images containing targets;
extracting depth features of training images in a training set by using ResNet50, and respectively transmitting the depth features to a target center regression branch and a target frame regression branch;
processing the depth features of the training image by using the target center regression branch to obtain the predicted target position of the training image, and using grid sampling to compute the alpha divergence of the target center regression branch for training;
processing the depth features of the training image by using the target frame regression branch to obtain the predicted target frame of the training image, and using Monte Carlo sampling to compute the alpha divergence of the target frame regression branch for training;
determining network parameters of the twin network through alpha divergence training to obtain a trained twin network;
and testing the trained twin network by using a test set.
Furthermore, in the twin network training process, the first three frames of images including the current frame are selected as a group of training images and input into the twin network, and the last three frames of images including the current frame are selected as a group of test images and input into the twin network for testing.
Further, an initialization layer is adopted in the target center regression branch to initialize the convolution kernel, and an optimization layer is adopted to update the filter; and the target frame regression branch, based on IoU-Net, applies a fully connected layer to the depth features of the training image or the test image to obtain a modulation vector, and then regresses the degree of overlap between each candidate window and the real target frame.
Further, the calculation formula of the alpha divergence of the target center regression branch or the target frame regression branch is as follows:
D_α[p(y|y_i) || p(y|x_i,θ)] = (1/(α(1−α))) · (1 − ∫ p(y|y_i)^α · p(y|x_i,θ)^(1−α) dy)
wherein p(y|x_i,θ) represents the conditional probability distribution output by the target center regression branch or the target frame regression branch, p(y|y_i) represents the conditional probability distribution of the true annotation in the training image, D_α[p(y|y_i)||p(y|x_i,θ)] denotes the alpha divergence between p(y|y_i) and p(y|x_i,θ), y represents the true target position or the true target frame, x_i represents the i-th training image, θ is a parameter of the target center regression branch or the target frame regression branch, α is the control coefficient of the alpha divergence, y_i represents the manually labeled target position or labeled target frame in the i-th training image, s_θ(y,x_i) represents the score output by the target center regression branch or the target frame regression branch for the sample (x_i, y), i = 1,2,…,n, and n is the number of training images in the training set.
Further, the method of using grid sampling to compute the alpha divergence of the target center regression branch for training comprises the following steps:
dividing the confidence score map output by the target center regression branch into K grid cells, so that the integral over y can be approximated by a sum over the set of sampled target positions {y^(1), y^(2), …, y^(K)}, where y^(k) denotes the sampled target position of the k-th grid point;
and expressing the alpha divergence by using a grid sampling method and using the alpha divergence as a loss function of the target center regression branch, wherein the expression of the corresponding loss function of the ith training image in the target center regression branch is as follows:
L_i = C · (1 − A Σ_{k=1}^{K} p(y^(k)|y_i)^α · p(y^(k)|x_i,θ)^(1−α)),   where p(y^(k)|x_i,θ) = exp(s_θ(y^(k),x_i)) / (A Σ_{j=1}^{K} exp(s_θ(y^(j),x_i)))
wherein L_i represents the corresponding loss function of the i-th training image in the target center regression branch, C is 1/(α(1−α)), α is the control coefficient of the alpha divergence, A is the scaling factor of the grid sampling method, p(y^(k)|y_i) represents the conditional probability of the true annotation at the k-th grid point, s_θ(y^(k),x_i) represents the confidence score output by the target center regression branch for the sample (x_i, y^(k)), x_i represents the i-th training image, θ is a parameter of the target center regression branch, i = 1,2,…,n, and n is the number of training images in the training set;
and training the network parameters of the target center regression branch with the loss function L_i to obtain a filter for determining the target position.
Further, the method of using Monte Carlo sampling to compute the alpha divergence of the target frame regression branch for training comprises the following steps:
and utilizing Monte Carlo sampling to represent alpha divergence and taking the alpha divergence as a loss function of the regression branch of the target frame, wherein the expression of the corresponding loss function of the ith training image in the regression branch of the target frame is as follows:
L'_i = C · (1 − (1/H) Σ_{h=1}^{H} p(y^(h)|y_i)^α · p(y^(h)|x_i,θ)^(1−α) / q(y^(h)|y_i))
wherein L'_i represents the corresponding loss function of the i-th training image in the target frame regression branch, C is 1/(α(1−α)), α is the control coefficient of the alpha divergence, H is the number of Monte Carlo samples, y^(h) represents the true target frame drawn in the h-th sample, p(y^(h)|y_i) denotes the true probability distribution given the labeled target frame y_i, q(y^(h)|y_i) denotes the sampling distribution given the labeled target frame y_i, s_θ(y^(h),x_i) represents the overlap output by the target frame regression branch for the sample (x_i, y^(h)), p(y^(h)|x_i,θ) is the conditional probability distribution output by the target frame regression branch evaluated at y^(h), x_i represents the i-th training image, θ is a parameter of the target frame regression branch, i = 1,2,…,n, and n is the number of training images in the training set;
and training the network parameters of the target frame regression branch with the loss function L'_i.
Further, the method comprises the following steps:
and, after the trained twin network has tracked a preset number of frames of images to be tracked, updating the network parameters of the target center regression branch in the twin network with online update samples to obtain a newly trained twin network.
Further, the value range of the preset frame number is 5-20.
By adopting the above technical means, the following advantages can be obtained:
the invention provides a twin network tracking method based on alpha divergence, which extracts deep features of an input image by using ResNet, obtains a predicted target position and a predicted target frame of a target in the input image by using a target center regression branch and a target frame regression branch respectively, and can give a motion track of the target in a video sequence. According to the method, from the angle of probability, conditional probability distribution is used as the output of the twin network, alpha divergence is used as the loss function of the network, twin network training is carried out through a large number of data sets, the distribution of network output distribution and real marking can be fitted, further, uncertainty existing in a target area of manual marking and noise introduced by manual marking are eliminated, tracking interference is reduced, and the robustness and accuracy of target tracking are improved.
Within the twin network structure, the method solves the alpha divergence of the target center regression branch and the target frame regression branch with a grid sampling method and a Monte Carlo sampling method respectively, so that the twin network can use the alpha divergence directly without being affected by the choice of loss function. In addition, the method can update the network parameters during practical application, which further guarantees the tracking performance of the twin network.
The target tracking performance of the method is better than that of existing trackers, with higher accuracy, success rate and speed, so the method has higher accuracy and robustness and very broad application prospects.
Drawings
FIG. 1 is a flowchart of the steps of an alpha divergence-based twin network tracking method of the present invention;
FIG. 2 is a network structure diagram of a twin network in an embodiment of the present invention;
FIG. 3 is a flowchart illustrating the steps of training twin networks according to an embodiment of the present invention;
FIG. 4 is a graph of the accuracy of the method and contrast tracker of the present invention on an OTB100 data set in an embodiment of the present invention;
FIG. 5 is a graph of the success rate of the method of the present invention and a comparison tracker on an OTB100 data set in an embodiment of the present invention;
FIG. 6 is a graph of the accuracy of the method and contrast tracker of the present invention on a UAV123 data set in accordance with an embodiment of the present invention;
FIG. 7 is a graph of the success rate of the method of the present invention and the comparison trackers on the UAV123 data set in an embodiment of the present invention.
Detailed Description
The technical solution of the invention is further explained below with reference to the accompanying drawings:
the invention provides a twin network tracking method based on alpha divergence, which specifically comprises the following steps as shown in figure 1:
step 1, obtaining an image to be tracked and a well-trained twin network, wherein the twin network is trained based on alpha divergence. When one or more targets in one video need to be tracked and identified, images in the video can be extracted according to a time sequence, each frame of image is used as an image to be tracked, and the image is input into a trained twin network for target tracking and identification.
In the embodiment of the present invention, the network structure of the twin network is shown in FIG. 2. The twin network mainly comprises one backbone network and two branch networks: the backbone network adopts ResNet50, and the branch networks are the target center regression branch and the target frame regression branch, respectively, each of which can be regarded as a convolutional neural network.
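As a rough structural illustration of this one-backbone, two-branch layout, the following is a minimal PyTorch sketch. The module names, channel sizes, input resolution and the simplified branch heads are assumptions for illustration, not the patent's implementation.

```python
# Minimal structural sketch of the twin network (assumed names and shapes).
import torch
import torch.nn as nn
import torchvision

class TwinTracker(nn.Module):
    def __init__(self, hidden_dim=256):
        super().__init__()
        resnet = torchvision.models.resnet50(weights=None)   # torchvision >= 0.13
        # Backbone: ResNet50 up to layer4; layer3/layer4 outputs feed the two branches.
        self.stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool,
                                  resnet.layer1, resnet.layer2)
        self.layer3 = resnet.layer3          # 1024-channel features
        self.layer4 = resnet.layer4          # 2048-channel features
        # Target center regression branch: a single correlation-filter head (placeholder).
        self.center_head = nn.Conv2d(1024, 1, kernel_size=4)
        # Target frame regression branch: scores pooled candidate-box features (placeholder).
        self.box_head = nn.Sequential(nn.Linear(2048, hidden_dim), nn.ReLU(),
                                      nn.Linear(hidden_dim, 1))

    def forward(self, frame, box_feats=None):
        f3 = self.layer3(self.stem(frame))
        f4 = self.layer4(f3)
        score_map = self.center_head(f3)                 # confidence scores for the target center
        iou_scores = self.box_head(box_feats) if box_feats is not None else None
        return score_map, f4, iou_scores

x = torch.randn(1, 3, 288, 288)
score_map, f4, _ = TwinTracker()(x)
```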
Step 2, extracting the depth features of the image to be tracked with ResNet50 in the trained twin network. After each frame of the image to be tracked is input into ResNet50, the features output by the third and fourth layers of ResNet50 are pooled to obtain the depth features of the image to be tracked, and these depth features are input into the target center regression branch and the target frame regression branch.
Step 3, processing the depth features of the image to be tracked with the target center regression branch in the trained twin network. The target center regression branch is equivalent to a filter that coarsely locates the target: the depth features of the image to be tracked are convolved with the filter to obtain a confidence score map, and the position corresponding to the maximum confidence score is the predicted target position of the image to be tracked.
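As an illustration of this coarse localization step, the sketch below correlates the depth feature with a learned filter and takes the argmax of the resulting confidence map; the tensor shapes and the stride value are assumptions.

```python
# Sketch of coarse target localization from the confidence score map (assumed shapes).
import torch
import torch.nn.functional as F

def predict_center(feature, filt, stride=16):
    """feature: (1, C, H, W) depth feature of the frame; filt: (1, C, kH, kW) learned filter."""
    score_map = F.conv2d(feature, filt)                       # (1, 1, H', W') confidence scores
    flat_idx = torch.argmax(score_map.view(-1))
    w = score_map.shape[-1]
    row = torch.div(flat_idx, w, rounding_mode="floor")
    col = flat_idx % w
    # Map the score-map cell with the maximum confidence back to image coordinates.
    return (int(row) * stride, int(col) * stride), score_map

center, scores = predict_center(torch.randn(1, 1024, 18, 18), torch.randn(1, 1024, 4, 4))
```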
Step 4, processing the depth features of the image to be tracked with the target frame regression branch in the trained twin network. The target frame regression branch is based on IoU-Net: a plurality of candidate target frames are generated around the target position in the image to be tracked, a modulation vector is obtained with a modulation network, and the IoU score (degree of overlap) of each candidate target frame is regressed; the candidate target frame with the maximum IoU score is the predicted target frame of the image to be tracked.
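The frame-refinement step can be pictured with the sketch below: candidate boxes are jittered around the predicted center and the one with the highest predicted IoU score is kept. The `iou_net` callable stands in for the modulation-based IoU predictor, and the perturbation magnitudes are assumptions.

```python
# Sketch of selecting the candidate box with the maximum predicted IoU score.
import torch

def refine_box(iou_net, frame_feat, prev_size, center, num_candidates=10):
    cx, cy = float(center[0]), float(center[1])
    w, h = float(prev_size[0]), float(prev_size[1])
    # Candidate boxes (cx, cy, w, h): random perturbations of position and scale.
    pos_noise = 5.0 * torch.randn(num_candidates, 2)           # pixel offsets (assumed scale)
    scale_noise = 1.0 + 0.1 * torch.randn(num_candidates, 2)   # relative size changes
    candidates = torch.cat([torch.tensor([cx, cy]) + pos_noise,
                            torch.tensor([w, h]) * scale_noise], dim=1)
    iou_scores = iou_net(frame_feat, candidates)               # one predicted IoU per candidate
    return candidates[torch.argmax(iou_scores)]

# Toy usage with a dummy IoU predictor that scores boxes at random.
dummy_iou_net = lambda feat, boxes: torch.rand(boxes.shape[0])
best = refine_box(dummy_iou_net, torch.randn(1, 2048, 9, 9), (64, 48), (120, 100))
```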
Step 5, after the trained twin network has tracked a preset number of frames of images to be tracked, updating the network parameters of the target center regression branch in the twin network with online update samples to obtain a newly trained twin network.
In order to further improve the target tracking effect, the network parameters need to be updated during practical application. The method stores every frame of the image to be tracked that is input into the twin network and takes the 50 most recently stored frames as online update samples. After the twin network has continuously tracked the preset number of frames, the target center regression branch is trained again with the online update samples, the network parameters of the filter are updated, and the updated twin network is used for target tracking in subsequent frames. The preset number of frames ranges from 5 to 20.
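A minimal sketch of this update schedule is given below. The buffer size of 50 frames and the 5-20 frame interval follow the text, while `retrain_center_branch` is an assumed placeholder for the branch re-training routine.

```python
# Sketch of the online update schedule for the target center regression branch.
from collections import deque

class OnlineUpdater:
    def __init__(self, update_interval=10, buffer_size=50):
        assert 5 <= update_interval <= 20          # preset frame number range from the text
        self.update_interval = update_interval
        self.buffer = deque(maxlen=buffer_size)    # the 50 most recently tracked frames
        self.frame_count = 0

    def step(self, frame_sample, tracker, retrain_center_branch):
        self.buffer.append(frame_sample)
        self.frame_count += 1
        if self.frame_count % self.update_interval == 0:
            # Re-train only the target center regression branch (the filter) on stored samples.
            retrain_center_branch(tracker, list(self.buffer))
```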
In order to avoid the deviation of the tracking effect caused by the choice of loss function in deep learning, the invention uses the alpha divergence as the loss function of the two branch networks for twin network training. The principle is to minimize the alpha divergence between the conditional probability distribution output by the network and the distribution of the true annotations, so that the predicted distribution approximates the true distribution. The calculation formula of the alpha divergence of the target center regression branch or the target frame regression branch is as follows:
D_α[p(y|y_i) || p(y|x_i,θ)] = (1/(α(1−α))) · (1 − ∫ p(y|y_i)^α · p(y|x_i,θ)^(1−α) dy)   (4)
wherein D_α[p(y|y_i)||p(y|x_i,θ)] denotes the alpha divergence between p(y|y_i) and p(y|x_i,θ), p(y|x_i,θ) represents the conditional probability distribution output by the target center regression branch or the target frame regression branch, p(y|y_i) represents the conditional probability distribution of the true annotation in the training image, and y represents the true value of the sample: in the target center regression branch y represents the true target position, and in the target frame regression branch y represents the true target frame. x_i represents the i-th training image, θ is a parameter of the target center regression branch or the target frame regression branch, and α is the control coefficient of the alpha divergence; by adjusting α manually, the predicted distribution can be made to fit the true distribution more closely. y_i represents the manual annotation in the i-th training image: in the target center regression branch y_i is the manually labeled target position, and in the target frame regression branch y_i is the manually labeled target frame. s_θ(y,x_i) represents the score output by the target center regression branch or the target frame regression branch for the sample (x_i, y), i.e. the output of the branch network at position y of image x_i; in the target center regression branch s_θ(y,x_i) is a confidence score, and in the target frame regression branch s_θ(y,x_i) is the degree of overlap, also called the IoU score. i = 1,2,…,n, and n is the number of training images in the training set.
p(y|x_i,θ) is defined as follows:
p(y|x_i,θ) = exp(s_θ(y,x_i)) / Z_θ(x_i)   (5)
Z_θ(x_i) = ∫ exp(s_θ(y,x_i)) dy   (6)
Given a network f_θ(·), the translational invariance of two-dimensional images can be exploited to parameterize the output efficiently as s_θ(y,x_i) = f_θ(x_i)(y).
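To make equations (4)-(6) concrete, the sketch below discretizes a score map, converts it into p(y|x_i,θ) = exp(s)/Z via a softmax, and evaluates the alpha divergence to a label distribution p(y|y_i). The 19x19 grid size and the Gaussian-shaped label are illustrative assumptions, not values taken from the patent.

```python
# Numerical sketch of equations (4)-(6) on a discretized grid (assumed sizes).
import torch

def alpha_divergence(p_label, p_pred, alpha=0.5, eps=1e-12):
    """D_alpha[p_label || p_pred] = 1/(alpha*(1-alpha)) * (1 - sum p_label^alpha * p_pred^(1-alpha))."""
    c = 1.0 / (alpha * (1.0 - alpha))
    overlap = torch.sum(p_label.clamp_min(eps) ** alpha * p_pred.clamp_min(eps) ** (1.0 - alpha))
    return c * (1.0 - overlap)

scores = torch.randn(19, 19)                           # s_theta(y, x_i) on a 19x19 grid
p_pred = torch.softmax(scores.view(-1), dim=0)         # p(y | x_i, theta) = exp(s) / Z_theta(x_i)

# Label distribution p(y | y_i): a Gaussian bump around the annotated position (9, 9).
yy, xx = torch.meshgrid(torch.arange(19.0), torch.arange(19.0), indexing="ij")
p_label = torch.exp(-((yy - 9) ** 2 + (xx - 9) ** 2) / (2 * 2.0 ** 2)).view(-1)
p_label = p_label / p_label.sum()

loss = alpha_divergence(p_label, p_pred, alpha=0.5)    # smaller when p_pred matches p_label
```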
As shown in fig. 3, the training process of the twin network is as follows:
and step A, constructing a basic framework of the twin network, wherein the specific framework is shown in figure 2.
B, obtaining a training set and a test set of the twin network, wherein the training set comprises a plurality of training images containing targets, and each training image contains an artificially labeled target position and a labeled target frame; the test set includes a plurality of test images containing the object.
When the subsequent training and testing operations are performed, the method selects the first three frames of images including the current frame as a group of training images and inputs them into the twin network, and selects the last three frames of images including the current frame as a group of test images and inputs them into the twin network for testing.
Step C, extracting the depth features of the training images in the training set with ResNet50, and transmitting the depth features to the target center regression branch and the target frame regression branch respectively. The features of the input training image are extracted with ResNet50; specifically, the features of the third and fourth layers of ResNet50 are used, pooled by the PrPool pooling layer, and then input into the branch networks.
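The pooling step can be pictured with the sketch below, which uses torchvision's roi_align as a stand-in for the PrPool layer mentioned in the text (an assumption, since PrPool is not part of torchvision); the feature-map size, ROI coordinates and output size are illustrative only.

```python
# Sketch of pooling backbone features over a target region before the branch networks.
import torch
from torchvision.ops import roi_align

feat_layer3 = torch.randn(1, 1024, 18, 18)            # assumed layer-3 feature map (stride 16)
# One region of interest: (batch_index, x1, y1, x2, y2) in image coordinates.
rois = torch.tensor([[0.0, 32.0, 32.0, 160.0, 160.0]])
pooled = roi_align(feat_layer3, rois, output_size=(5, 5), spatial_scale=1.0 / 16)
print(pooled.shape)                                    # torch.Size([1, 1024, 5, 5])
```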
Step D, processing the depth features of the training image with the target center regression branch to obtain the predicted target position of the training image, and using grid sampling to compute the alpha divergence of the target center regression branch for training.
The target-centric regression branch is used for target-centric regression, where an initialization layer is used to initialize the convolution kernel (i.e., filter parameters), and the optimization layer is used to update the filter.
Step D01, performing a convolution operation on the depth features of the training image with the target center regression branch to obtain a confidence score map, and selecting the position corresponding to the maximum confidence score as the predicted target position of the training image.
Step D02, computing the loss of the target center regression branch from the predicted target position and the manually labeled target position.
The invention uses the grid sampling method to approximate the integral in the alpha divergence: the confidence score map output by the target center regression branch is divided into K grid cells, giving a set of sampled target positions {y^(1), y^(2), …, y^(K)}, where y^(k) denotes the sampled target position of the k-th grid point.
The alpha divergence in formula (4) is expressed with the grid sampling method and used as the loss function of the target center regression branch; the corresponding loss function of the i-th training image in the target center regression branch is:
L_i = C · (1 − A Σ_{k=1}^{K} p(y^(k)|y_i)^α · p(y^(k)|x_i,θ)^(1−α)),   where p(y^(k)|x_i,θ) = exp(s_θ(y^(k),x_i)) / (A Σ_{j=1}^{K} exp(s_θ(y^(j),x_i)))
wherein L_i represents the corresponding loss function of the i-th training image in the target center regression branch, C is 1/(α(1−α)), A is the scaling factor of the grid sampling method, p(y^(k)|y_i) represents the conditional probability of the true annotation at the k-th grid point, and s_θ(y^(k),x_i) represents the confidence score output by the target center regression branch for the sample (x_i, y^(k)). The final loss function is the average loss over a mini-batch of samples.
Step D03, training the network parameters of the target center regression branch with the loss function L_i to obtain a filter for determining the target position.
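A minimal sketch of this loss computation follows: the integral over y is replaced by an area-weighted sum over the K grid points, and the partition function Z_θ(x_i) is approximated on the same grid. The tensor shapes and the stand-in label distribution are assumptions, so treat it as an illustration rather than the patent's implementation.

```python
# Sketch of the grid-sampled alpha-divergence loss for the target center regression branch.
import torch

def center_alpha_loss(scores, p_label, cell_area=1.0, alpha=0.5, eps=1e-12):
    """scores: (K,) confidence scores s_theta(y^(k), x_i) at the K grid points.
    p_label: (K,) conditional probability density p(y^(k) | y_i) of the annotation."""
    c = 1.0 / (alpha * (1.0 - alpha))
    z = cell_area * torch.sum(torch.exp(scores))             # grid estimate of Z_theta(x_i)
    p_pred = torch.exp(scores) / z                           # p(y^(k) | x_i, theta)
    overlap = cell_area * torch.sum(p_label.clamp_min(eps) ** alpha
                                    * p_pred.clamp_min(eps) ** (1.0 - alpha))
    return c * (1.0 - overlap)

# Example: a 19x19 confidence map flattened to K = 361 grid points with cell area A = 1.
scores = torch.randn(361, requires_grad=True)
p_label = torch.softmax(torch.randn(361), dim=0)             # stand-in label distribution
loss = center_alpha_loss(scores, p_label)
loss.backward()                                              # gradients flow back to the filter scores
```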
Step E, processing the depth features of the training image with the target frame regression branch to obtain the predicted target frame of the training image, and using Monte Carlo sampling to compute the alpha divergence of the target frame regression branch for training.
Step E01, processing the depth features of the training image with the target frame regression branch, generating a plurality of candidate target frames at the target position of the training image, calculating the degree of overlap between each candidate target frame and the real target frame, and selecting the best candidate target frame as the predicted target frame of the training image according to the degree of overlap.
Step E02, computing the loss of the target frame regression branch from the predicted target frame and the manually labeled target frame.
The degree of overlap output by the target frame regression branch is denoted s_θ(y,x), i.e. the predicted overlap between a candidate frame y and the real target border y_bb in training image x. The invention does not use the negative log-likelihood loss −log p(y_i|x_i,θ) = log(∫exp(s_θ(y,x_i))dy) − s_θ(y_i,x_i); instead, the alpha divergence in formula (4) is used as the loss function, and Monte Carlo sampling is used to compute it. Grid sampling is not used here because, in target frame regression, it causes a large amount of computation, is difficult to extend to high dimensions, and introduces sampling bias. The uncertainty of the target frame usually arises during manual annotation; especially for small targets, different annotators may produce different labels and thus introduce noise. The invention therefore assumes that, given the labeled target frame y_i, the probability distribution q(y|y_i) used to sample candidate true target frames is centered on the labeled target frame (a Gaussian mixture model centered on the labeled target frame, as discussed below).
Monte Carlo sampling is used to express the alpha divergence as the loss function of the target frame regression branch; the corresponding loss function of the i-th training image in the target frame regression branch is:
L'_i = C · (1 − (1/H) Σ_{h=1}^{H} p(y^(h)|y_i)^α · p(y^(h)|x_i,θ)^(1−α) / q(y^(h)|y_i))
wherein L'_i represents the corresponding loss function of the i-th training image in the target frame regression branch, C is 1/(α(1−α)), H is the number of Monte Carlo samples, y^(h) represents the true target frame drawn in the h-th sample, p(y^(h)|y_i) denotes the true probability distribution given the labeled target frame y_i, q(y^(h)|y_i) denotes the sampling distribution given the labeled target frame y_i, s_θ(y^(h),x_i) represents the overlap output by the target frame regression branch for the sample (x_i, y^(h)), and p(y^(h)|x_i,θ) is obtained from s_θ(y^(h),x_i) through formula (5).
When the distribution of the candidate target frames covers the high-probability regions of both the true conditional probability distribution of the sample and the predicted conditional probability distribution, a Gaussian mixture model centered on the labeled target frame is sufficient for the frame regression task.
Step E03, training the network parameters of the target frame regression branch with the loss function L'_i.
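A minimal sketch of the Monte Carlo loss follows. For simplicity it samples candidate boxes from a single Gaussian p(y|y_i) centered on the labeled box and uses the same distribution as the proposal q(y|y_i), so the importance-weighted sum reduces to a plain average; the Gaussian-mixture proposal described above and the exact normalization are not reproduced here, and all shapes and magnitudes are assumptions.

```python
# Sketch of the Monte-Carlo alpha-divergence loss for the target frame regression branch.
import torch

def box_alpha_loss(iou_scores, log_p_true, alpha=0.5):
    """iou_scores: (H,) predicted overlaps s_theta(y^(h), x_i) for boxes sampled from p(y | y_i);
    log_p_true: (H,) log p(y^(h) | y_i) of those samples under the annotation-noise model."""
    c = 1.0 / (alpha * (1.0 - alpha))
    h = iou_scores.numel()
    # Z_theta(x_i) estimated by importance sampling with proposal q = p(y | y_i).
    log_z = torch.logsumexp(iou_scores - log_p_true, dim=0) - torch.log(torch.tensor(float(h)))
    log_p_pred = iou_scores - log_z                          # log p(y^(h) | x_i, theta)
    ratio = torch.exp((1.0 - alpha) * (log_p_pred - log_p_true))
    return c * (1.0 - torch.mean(ratio))

# Example: H = 128 boxes jittered around the labeled box (normalized cx, cy, w, h).
H, sigma = 128, 0.05
labeled_box = torch.tensor([0.3, 0.4, 0.2, 0.25])
samples = labeled_box + sigma * torch.randn(H, 4)
log_p_true = torch.distributions.Normal(labeled_box, sigma).log_prob(samples).sum(dim=1)
iou_scores = torch.randn(H, requires_grad=True)              # stand-in for the branch output
loss = box_alpha_loss(iou_scores, log_p_true)
loss.backward()
```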
Step F, determining the network parameters of the twin network through alpha-divergence training to obtain the trained twin network.
Step G, testing the trained twin network with the test set.
To verify the effectiveness of the method of the invention, a set of comparative experiments is given below:
the hardware of the comparison experiment adopts two RTX 2080Ti display cards, one 12-core, two-process CPUs of each core and a 64G server running a memory to train and experiment.
First, the twin network is trained for 50 epochs on the COCO, GOT10K, LaSOT and TrackingNet datasets, with 1000 iterations per epoch, to obtain the final network parameters. Then the method is compared with trackers such as STRCF, LADCF, ECO-HC, GFSDCF, ARCF-H, ARCF-HC, AutoTrack, SAMF, KCF, DSST, HOG-LR, BACF, Staple+CA, SRDCF and SAMF+AT on the OTB100 and UAV123 datasets. The accuracy and success rate of the method and the comparison trackers on the two datasets are shown in FIGS. 4-7.
As can be seen from FIGS. 4 and 5, the tracking performance of the method (alphaTK) on the OTB100 dataset is significantly better than that of the other comparison trackers: 0.3% higher than the second-best STRCF on the accuracy curve and 2% higher than the second-best LADCF on the success-rate curve. As can be seen from FIGS. 6 and 7, the accuracy and success rate of the method on the UAV123 dataset are also higher than those of the other comparison trackers: 4.4% higher than the second-best AutoTrack on the accuracy curve and 14.1% higher than the second-best on the success-rate curve. The comparative experiments show that the method achieves good target tracking performance on both OTB100 and UAV123, with particularly prominent performance on UAV123.
From a probabilistic perspective, the method uses conditional probability distributions as the output of the twin network and the alpha divergence as the network loss function; by training on large datasets, the network output distribution is fitted to the distribution of the true annotations, which compensates for the uncertainty of manually annotated target regions and the noise introduced by manual annotation, reduces tracking interference, and improves the robustness and accuracy of target tracking. Within the twin network structure, the method solves the alpha divergence of the target center regression branch and the target frame regression branch with a grid sampling method and a Monte Carlo sampling method respectively, so that the twin network can use the alpha divergence directly without being affected by the choice of loss function. The target tracking performance of the method is better than that of existing trackers, and the method has very broad application prospects.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, it is possible to make various improvements and modifications without departing from the technical principle of the present invention, and those improvements and modifications should be considered as the protection scope of the present invention.

Claims (8)

1. A twin network tracking method based on alpha divergence is characterized by comprising the following steps:
acquiring an image to be tracked and a well-trained twin network, wherein the twin network is trained based on alpha divergence;
extracting the depth features of the image to be tracked by using ResNet50 in the trained twin network;
processing the depth characteristics of the image to be tracked by using a target center regression branch in the trained twin network to obtain the predicted target position of the image to be tracked;
processing the depth characteristics of the image to be tracked by using the target frame regression branch in the trained twin network to obtain a predicted target frame of the image to be tracked;
the twin network training process is as follows:
constructing a basic framework of a twin network, wherein the twin network comprises a main network adopting ResNet50, a target center regression branch and a target frame regression branch;
obtaining a training set and a test set of the twin network, wherein the training set or the test set comprises a plurality of training images or test images containing targets;
extracting depth features of training images in a training set by using ResNet50, and respectively transmitting the depth features to a target center regression branch and a target frame regression branch;
processing the depth features of the training image by using the target center regression branch to obtain the predicted target position of the training image, and using grid sampling to compute the alpha divergence of the target center regression branch for training;
processing the depth features of the training image by using the target frame regression branch to obtain the predicted target frame of the training image, and using Monte Carlo sampling to compute the alpha divergence of the target frame regression branch for training;
determining network parameters of the twin network through alpha divergence training to obtain a trained twin network;
and testing the trained twin network by using a test set.
2. The method as claimed in claim 1, wherein, during the twin network training process, the first three frames of images including the current frame are selected as a set of training images and input into the twin network, and the last three frames of images including the current frame are selected as a set of test images and input into the twin network for testing.
3. The method as claimed in claim 1, wherein an initialization layer is used in said target center regression branch to initialize the convolution kernel, and an optimization layer is used to update the filter; and the target frame regression branch, based on IoU-Net, applies a fully connected layer to the depth features of the training image or the test image to obtain a modulation vector, and then regresses the degree of overlap between each candidate window and the real target frame.
4. The twin network tracking method based on alpha divergence according to claim 1, wherein the calculation formula of the alpha divergence of the target center regression branch or the target frame regression branch is as follows:
D_α[p(y|y_i) || p(y|x_i,θ)] = (1/(α(1−α))) · (1 − ∫ p(y|y_i)^α · p(y|x_i,θ)^(1−α) dy)
wherein p(y|x_i,θ) represents the conditional probability distribution output by the target center regression branch or the target frame regression branch, p(y|y_i) represents the conditional probability distribution of the true annotation in the training image, D_α[p(y|y_i)||p(y|x_i,θ)] denotes the alpha divergence between p(y|y_i) and p(y|x_i,θ), y represents the true target position or the true target frame, x_i represents the i-th training image, θ is a parameter of the target center regression branch or the target frame regression branch, α is the control coefficient of the alpha divergence, y_i represents the manually labeled target position or labeled target frame in the i-th training image, s_θ(y,x_i) represents the score output by the target center regression branch or the target frame regression branch for the sample (x_i, y), i = 1,2,…,n, and n is the number of training images in the training set.
5. The twin network tracking method based on alpha divergence according to claim 1, wherein the method for training the alpha divergence of the regression branch at the center of the target by using grid sampling comprises:
dividing the confidence score map output by the target center regression branch into K grid cells, so that the integral over y can be approximated by a sum over the set of sampled target positions {y^(1), y^(2), …, y^(K)}, where y^(k) denotes the sampled target position of the k-th grid point;
and expressing the alpha divergence by using a grid sampling method and using the alpha divergence as a loss function of the target center regression branch, wherein the expression of the corresponding loss function of the ith training image in the target center regression branch is as follows:
L_i = C · (1 − A Σ_{k=1}^{K} p(y^(k)|y_i)^α · p(y^(k)|x_i,θ)^(1−α)),   where p(y^(k)|x_i,θ) = exp(s_θ(y^(k),x_i)) / (A Σ_{j=1}^{K} exp(s_θ(y^(j),x_i)))
wherein L_i represents the corresponding loss function of the i-th training image in the target center regression branch, C is 1/(α(1−α)), α is the control coefficient of the alpha divergence, A is the scaling factor of the grid sampling method, p(y^(k)|y_i) represents the conditional probability of the true annotation at the k-th grid point, s_θ(y^(k),x_i) represents the confidence score output by the target center regression branch for the sample (x_i, y^(k)), x_i represents the i-th training image, θ is a parameter of the target center regression branch, i = 1,2,…,n, and n is the number of training images in the training set;
and training the network parameters of the target center regression branch with the loss function L_i to obtain a filter for determining the target position.
6. The twin network tracking method based on alpha divergence according to claim 1, wherein the method for training the alpha divergence of the regression branch of the target frame by using Monte Carlo sampling comprises:
and utilizing Monte Carlo sampling to represent alpha divergence and taking the alpha divergence as a loss function of the regression branch of the target frame, wherein the expression of the corresponding loss function of the ith training image in the regression branch of the target frame is as follows:
L'_i = C · (1 − (1/H) Σ_{h=1}^{H} p(y^(h)|y_i)^α · p(y^(h)|x_i,θ)^(1−α) / q(y^(h)|y_i))
wherein L'_i represents the corresponding loss function of the i-th training image in the target frame regression branch, C is 1/(α(1−α)), α is the control coefficient of the alpha divergence, H is the number of Monte Carlo samples, y^(h) represents the true target frame drawn in the h-th sample, p(y^(h)|y_i) denotes the true probability distribution given the labeled target frame y_i, q(y^(h)|y_i) denotes the sampling distribution given the labeled target frame y_i, s_θ(y^(h),x_i) represents the overlap output by the target frame regression branch for the sample (x_i, y^(h)), p(y^(h)|x_i,θ) represents the conditional probability distribution output by the target frame regression branch evaluated at y^(h), x_i represents the i-th training image, θ is a parameter of the target frame regression branch, i = 1,2,…,n, and n is the number of training images in the training set;
and training the network parameters of the target frame regression branch with the loss function L'_i.
7. The method of alpha divergence based twin network tracking as claimed in claim 1, further comprising the steps of:
and after the trained twin network tracks the images to be tracked with the preset frame number, updating the network parameters of the target center regression branch in the twin network by using the online updating sample to obtain a new trained twin network.
8. The twin network tracking method based on alpha divergence as claimed in claim 7, wherein the value range of said preset number of frames is 5-20.
CN202110556609.7A 2021-05-21 2021-05-21 Twin network tracking method based on alpha divergence Active CN113298136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110556609.7A CN113298136B (en) 2021-05-21 2021-05-21 Twin network tracking method based on alpha divergence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110556609.7A CN113298136B (en) 2021-05-21 2021-05-21 Twin network tracking method based on alpha divergence

Publications (2)

Publication Number Publication Date
CN113298136A CN113298136A (en) 2021-08-24
CN113298136B true CN113298136B (en) 2022-08-05

Family

ID=77323619

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110556609.7A Active CN113298136B (en) 2021-05-21 2021-05-21 Twin network tracking method based on alpha divergence

Country Status (1)

Country Link
CN (1) CN113298136B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116229052B (en) * 2023-05-09 2023-07-25 浩鲸云计算科技股份有限公司 Method for detecting state change of substation equipment based on twin network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111179307A (en) * 2019-12-16 2020-05-19 浙江工业大学 Visual target tracking method for full-volume integral and regression twin network structure
CN112712546A (en) * 2020-12-21 2021-04-27 吉林大学 Target tracking method based on twin neural network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111179307A (en) * 2019-12-16 2020-05-19 浙江工业大学 Visual target tracking method for full-volume integral and regression twin network structure
CN112712546A (en) * 2020-12-21 2021-04-27 吉林大学 Target tracking method based on twin neural network

Also Published As

Publication number Publication date
CN113298136A (en) 2021-08-24

Similar Documents

Publication Publication Date Title
CN113610126B (en) Label-free knowledge distillation method based on multi-target detection model and storage medium
Lim et al. Isolated sign language recognition using convolutional neural network hand modelling and hand energy image
Yuan et al. Robust visual tracking with correlation filters and metric learning
CN108256421A (en) A kind of dynamic gesture sequence real-time identification method, system and device
CN109859241B (en) Adaptive feature selection and time consistency robust correlation filtering visual tracking method
CN104484890B (en) Video target tracking method based on compound sparse model
CN106909938B (en) Visual angle independence behavior identification method based on deep learning network
CN104463191A (en) Robot visual processing method based on attention mechanism
CN112307995A (en) Semi-supervised pedestrian re-identification method based on feature decoupling learning
CN109508686B (en) Human behavior recognition method based on hierarchical feature subspace learning
CN113298136B (en) Twin network tracking method based on alpha divergence
CN110084834B (en) Target tracking method based on rapid tensor singular value decomposition feature dimension reduction
Tang et al. Transound: Hyper-head attention transformer for birds sound recognition
Yao RETRACTED ARTICLE: Deep learning analysis of human behaviour recognition based on convolutional neural network analysis
Huang et al. BSCF: Learning background suppressed correlation filter tracker for wireless multimedia sensor networks
CN103996207A (en) Object tracking method
CN114038011A (en) Method for detecting abnormal behaviors of human body in indoor scene
Liu et al. Key algorithm for human motion recognition in virtual reality video sequences based on hidden markov model
Ikram et al. Real time hand gesture recognition using leap motion controller based on CNN-SVM architechture
Guo et al. An adaptive kernelized correlation filters with multiple features in the tracking application
Zhang et al. Robust correlation tracking in unmanned aerial vehicle videos via deep target-specific rectification networks
CN116343335A (en) Motion gesture correction method based on motion recognition
Zhou et al. Hybrid generative-discriminative learning for online tracking of sperm cell
CN110659576A (en) Pedestrian searching method and device based on joint judgment and generation learning
CN109492530A (en) Robustness vision object tracking algorithm based on the multiple dimensioned space-time characteristic of depth

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant