CN115719367A - Twin target tracking method based on space-channel cross correlation and centrality guidance - Google Patents


Info

Publication number
CN115719367A
CN202211459889.0A (application) · CN115719367A (publication)
Authority
CN
China
Prior art keywords
centrality
classification
regression
correlation
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211459889.0A
Other languages
Chinese (zh)
Inventor
张建明
何宇凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology filed Critical Changsha University of Science and Technology
Priority to CN202211459889.0A priority Critical patent/CN115719367A/en
Publication of CN115719367A publication Critical patent/CN115719367A/en
Pending legal-status Critical Current

Abstract

The invention discloses a twin target tracking method based on space-channel cross correlation and centrality guidance, comprising the following steps: acquiring a template and a search region from an image; feeding the template and the search region into a deep feature-extraction network to obtain the template feature and the search-region feature; feeding the template feature and the search-region feature into two space-channel cross-correlation modules to obtain a feature map R1 suited to the classification sub-network and a feature map R2 suited to the regression sub-network; feeding R1 into the classification sub-network to obtain a classification map, and feeding R2 into the regression sub-network to obtain a centrality map and a regression map; and, after optimization through the loss functions, obtaining the target bounding box from the optimized classification map, centrality map and regression map. The invention effectively improves the accuracy and robustness of target tracking.

Description

Twin target tracking method based on space-channel cross correlation and centrality guidance
Technical Field
The invention relates to the technical field of computer vision target tracking, in particular to a twin target tracking method based on space-channel cross correlation and centrality guidance.
Background
Target tracking is a fundamental task of computer vision that integrates knowledge from machine learning, optimization, image processing and other fields, and is widely applied in automatic surveillance, vehicle navigation, robot perception, human-computer interaction and augmented reality. Although target tracking has made considerable progress, complex and changeable real-world scenes are affected by many factors such as illumination change, occlusion, rapid motion, deformation, scale change and interference from similar objects, so robust visual tracking remains a great challenge.
In recent years, depth trackers based on twin (Siamese) networks, such as SiamFC, SiamRPN, DaSiamRPN, SiamRPN++ and SiamCAR, have achieved a good balance between accuracy and speed, but they still have a number of shortcomings. SiamFC uses a simple local cross-correlation that is essentially a convolution: the template feature is used directly as a convolution kernel on the search feature, so the resulting feature map has only one channel. SiamRPN++ uses depth-wise cross-correlation, which performs the cross-correlation independently on each channel of the input, so the resulting feature map has the same number of channels as the search feature, but the feature information of different channels at the same spatial position is not exploited. Some trackers have begun to use a pixel-wise cross-correlation in the twin network which, in contrast to depth-wise cross-correlation, does not exploit the feature information of different spatial positions at the same channel. SiamCAR introduces a new centrality branch into the classification sub-network to suppress prediction boxes far from the center, but when the centrality branch shares the same feature map with the classification branch, its prediction tends to resemble the classification prediction, so the two branches cannot be well distinguished; the centrality branch is therefore not well placed in the classification sub-network, as shown in (a) of fig. 1. It is thus necessary to fuse the search-region feature and the template feature more effectively and make better use of the characteristics of the centrality branch, so as to improve the accuracy and robustness of target tracking.
Disclosure of Invention
Technical problem to be solved
Based on the above problems, the invention provides a twin target tracking method based on space-channel cross correlation and centrality guidance, which fuses the search-region feature and the template feature more effectively and makes better use of the characteristics of the centrality branch, addressing the need to improve the accuracy and robustness of target tracking.
(II) technical scheme
Based on the technical problem, the invention provides a twin target tracking method based on space-channel cross correlation and centrality guidance, which comprises the following steps:
s1, obtaining a template and a search area in an image;
S2, feeding the template and the search region into a deep feature-extraction network to obtain the template feature (denoted Z) and the search-region feature (denoted X), whose width × length × number of channels are Hz × Wz × C and Hx × Wx × C respectively;
s3, respectively sending the template features and the search region features into two space-channel cross-correlation modules to obtain a feature map R1 suitable for a classification subnetwork and a feature map R2 suitable for a regression subnetwork;
s4, sending the feature graph R1 into a classification sub-network to obtain a classification graph, and sending the feature graph R2 into a regression sub-network to obtain a centrality graph and a regression graph;
S5, optimizing steps S3-S4 through the loss functions, and obtaining the predicted target bounding box from the optimized classification map, centrality map and regression map;
the step S3 includes:
S31, reshaping the template feature Z along the spatial dimension into a spatial kernel K1 comprising Hz × Wz small kernels, each of size 1 × 1 × C; reshaping the template feature Z along the channel dimension into a channel kernel K2 comprising C small kernels, each of size 1 × 1 × HzWz;
S32, performing a pixel-wise cross-correlation between the search-region feature X and the spatial kernel K1 to obtain feature map F1: F1 = X ⋆ K1, where ⋆ denotes pixel-wise cross-correlation;
S33, performing a pixel-wise cross-correlation between feature map F1 and the channel kernel K2 to obtain feature map F2: F2 = F1 ⋆ K2;
S34, concatenating feature map F1 and feature map F2, reducing the dimensionality with a 1 × 1 convolution, and feeding the result into a plug-and-play SE-block module;
S35, performing steps S31-S34 twice to obtain, respectively, a feature map R1 suited to the classification sub-network and a feature map R2 suited to the regression sub-network, both of size Hz × Wz × C.
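The kernel construction and the two correlations in steps S31-S33 can be sketched with plain array operations. This is a minimal NumPy sketch under the assumption that both intermediate maps live on the search grid (pixel-wise correlation preserves the spatial size of its input, even though the patent states F2 as Hz × Wz × C); the function and variable names are illustrative, not taken from the patent:

```python
import numpy as np

def space_channel_xcorr(Z, X):
    """Sketch of one space-channel cross-correlation pass (steps S31-S33).

    Z: template feature, shape (Hz, Wz, C)
    X: search-region feature, shape (Hx, Wx, C)
    """
    Hz, Wz, C = Z.shape
    Hx, Wx, _ = X.shape
    # S31: spatial kernel K1 -- Hz*Wz small kernels of size 1x1xC.
    K1 = Z.reshape(Hz * Wz, C)
    # S32: pixel-wise cross-correlation of X with K1 -> F1, (Hx, Wx, HzWz).
    F1 = (X.reshape(Hx * Wx, C) @ K1.T).reshape(Hx, Wx, Hz * Wz)
    # S31 (channel view): channel kernel K2 -- C small kernels of size
    # 1x1xHzWz, i.e. the template read along the channel dimension.
    K2 = K1  # column c of K2 is the length-HzWz kernel for channel c
    # S33: pixel-wise cross-correlation of F1 with K2 -> F2, (Hx, Wx, C).
    F2 = (F1.reshape(Hx * Wx, Hz * Wz) @ K2).reshape(Hx, Wx, C)
    return F1, F2
```

Reshaping the template into HzWz kernels of size 1 × 1 × C turns the spatial correlation into a single matrix product, which is what makes this fusion lightweight enough to run twice.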
Further, in step S1, the template is an image of a specified pixel size cropped, centered on the target, from the first frame of a dataset or a camera capture; the search region is an image of a specified size cropped, during tracking, from the (i+1)-th frame centered on the target position of the i-th frame.
Further, the step S4 includes:
S41, feeding feature map R1 into the classification sub-network, a CNN with a single classification branch, which outputs a classification map; each point of the classification map predicts the probability that the location is foreground or background.
S42, feeding feature map R2 into the regression sub-network, a CNN comprising a centrality branch and a bounding-box regression branch, which output a centrality map and a regression map respectively; each point of the centrality map predicts the probability that the location is the target center, and each point of the regression map predicts the distances from that point to the top, bottom, left and right sides of the bounding box.
Further, in step S5, the optimization through the loss functions comprises: the centrality branch is trained with a centrality loss, the bounding-box regression branch with a centrality-weighted regression loss, and the classification branch with a classification loss.
Further, the classification loss is the cross entropy

L_cls = −(1/N) Σ_{i,j} [ y*_{i,j} log p_{i,j} + (1 − y*_{i,j}) log(1 − p_{i,j}) ],

the centrality loss is the binary cross entropy

L_cen = −(1/N_pos) Σ_{(i,j)∈P} [ c*_{i,j} log c_{i,j} + (1 − c*_{i,j}) log(1 − c_{i,j}) ],

and the centrality-weighted regression loss is

L_reg = (1/N_pos) Σ_{(i,j)∈P} c*_{i,j} (1 − IoU(B_{i,j}, B*_{i,j})),

where i, j denote the coordinate position on the corresponding map, y*_{i,j} is the ground-truth classification label, c*_{i,j} the ground-truth centrality, p_{i,j} and c_{i,j} the network predictions, N the total number of samples, N_pos the number of positive samples, P the set of positive samples, IoU the intersection-over-union of the two boxes in parentheses, and B_{i,j} and B*_{i,j} the predicted and ground-truth bounding boxes respectively.
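The three losses can be sketched on toy maps with NumPy. This assumes the standard cross-entropy, binary-cross-entropy and centrality-weighted IoU forms implied by the text; since the patent's own equations appear only as images, the exact normalizations here are assumptions:

```python
import numpy as np

def eps_log(p):
    """Numerically safe log for probabilities."""
    return np.log(np.clip(p, 1e-8, 1.0))

def classification_loss(p_fg, y):
    """Cross entropy over all positions; p_fg is the predicted foreground
    probability and y the 0/1 ground-truth label (both H x W)."""
    return float(-np.mean(y * eps_log(p_fg) + (1 - y) * eps_log(1 - p_fg)))

def centrality_loss(c_pred, c_true, pos):
    """Binary cross entropy averaged over the positive-sample mask `pos`."""
    n_pos = max(int(pos.sum()), 1)
    bce = -(c_true * eps_log(c_pred) + (1 - c_true) * eps_log(1 - c_pred))
    return float((bce * pos).sum() / n_pos)

def iou_ltrb(a, b):
    """IoU of two boxes given as (l, t, r, b) distances from a shared point."""
    inter = (min(a[0], b[0]) + min(a[2], b[2])) * (min(a[1], b[1]) + min(a[3], b[3]))
    union = (a[0] + a[2]) * (a[1] + a[3]) + (b[0] + b[2]) * (b[1] + b[3]) - inter
    return inter / union

def weighted_regression_loss(reg_pred, reg_true, c_true, pos):
    """Centrality-weighted IoU loss over positive positions: boxes far from
    the center get small ground-truth centrality and hence a small weight."""
    n_pos = max(int(pos.sum()), 1)
    total = 0.0
    for i, j in zip(*np.nonzero(pos)):
        total += c_true[i, j] * (1.0 - iou_ltrb(reg_pred[i, j], reg_true[i, j]))
    return float(total / n_pos)
```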
Further, in step S5, obtaining the predicted target bounding box from the optimized classification map, centrality map and regression map comprises:
multiplying the foreground part of the classification map by the centrality map, the point of maximum response being the predicted target center; the regression map then gives the distances from that point to the four sides of the predicted bounding box, which, combined with the predicted target center, yield the predicted target bounding box.
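The decoding step can be sketched as follows; the stride that maps feature-map cells back to search-image pixels is an assumed parameter, not specified at this point in the patent:

```python
import numpy as np

def decode_bbox(cls_fg, cen, reg, stride=8):
    """Pick the location with the highest cls*cen response and decode its
    (l, t, r, b) regression into an (x1, y1, x2, y2) box in search-image
    pixels.  `stride` maps feature cells to pixels (value assumed)."""
    score = cls_fg * cen                     # foreground prob x centrality
    i, j = np.unravel_index(np.argmax(score), score.shape)
    l, t, r, b = reg[i, j]
    cx, cy = j * stride, i * stride          # feature cell -> pixel coords
    return (cx - l, cy - t, cx + r, cy + b)
```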
Further, the specified size of the template crop is 127 × 127 pixels, and the specified size of the search-region crop is 287 × 287 pixels.
Further, the width × length × number of channels of the template feature is 13 × 13 × 256, and the width × length × number of channels of the search region feature is 25 × 25 × 256.
The invention also discloses a twin target tracking system based on space-channel cross-correlation and centrality guidance, which comprises at least one processor; and at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor calls the program instructions to execute the twin target tracking method based on space-channel cross-correlation and centrality guidance, and the method comprises the following functional modules:
a data acquisition module for executing the step S1;
a feature extraction module for executing the step S2;
a space-channel cross-correlation module for performing said step S3;
a classification regression module for executing the step S4;
a prediction module, configured to perform the step S5.
A non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the twin target tracking method based on space-channel cross-correlation and centrality guidance is also disclosed.
(III) advantageous effects
The technical scheme of the invention has the following advantages:
(1) Feeding the search-region feature and the template feature into the space-channel cross-correlation module fuses the features in both the spatial and the channel dimensions, exploiting the feature information of different channels at the same spatial position as well as that of different spatial positions at the same channel; the search-region and template features are thus fused more effectively, improving the accuracy and robustness of target tracking while reducing interference from similar objects and effectively reducing the computational cost;
(2) The regression sub-network comprises a centrality branch and a regression branch; the centrality branch guides the whole regression sub-network, the centrality target guides the centrality branch, and the centrality target is also multiplied as a weight onto the IoU loss, suppressing the weight of low-quality predicted target bounding boxes far from the center point and improving the accuracy of the predicted target bounding box;
(3) Because the feature fusion performed by the space-channel cross-correlation module is lightweight, the operation can be performed twice to obtain two feature maps of different applicability, used respectively as the inputs of the classification sub-network and the regression sub-network to handle the different subtasks; the two tasks are thus better distinguished while the accuracy and robustness of target tracking are improved.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are schematic and are not to be understood as limiting the invention in any way, and in which:
FIG. 1 is a schematic diagram comparing SiamCAR and an embodiment of the method of the invention;
FIG. 2 is a general schematic diagram of a twin target tracking method based on space-channel cross-correlation and centrality guidance according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the spatial-channel cross-correlation module portion of step S3 according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the part of the regression subnetwork with centrality guidance in step S42 according to the embodiment of the present invention;
FIG. 5 is a comparison of the performance of the method of an embodiment of the present invention with other methods on OTB100;
FIG. 6 is a comparison of the performance of the method of the present invention with other methods on UAV123.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
An embodiment of the present invention is a twin target tracking method based on space-channel cross correlation and centrality guidance, as shown in fig. 1 (b) and fig. 2, including the following steps:
s1, obtaining a template and a search area which are cut into a specified size in an image:
an image of a specified pixel size is cropped, centered on the target, from the first frame of a dataset or a camera capture to serve as the template, and during tracking an image of a set size is cropped from the (i+1)-th frame, centered on the target position of the i-th frame, to serve as the search region; the specified crop sizes for the template and the search region are 127 × 127 and 287 × 287 pixels, respectively.
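The cropping in step S1 can be sketched as below. The mean-padding policy and the function signature are assumptions for illustration (the patent only specifies the crop sizes):

```python
import numpy as np

def center_crop(img, center, size):
    """Crop a size x size patch centered on `center` = (x, y), padding with
    the image mean when the window leaves the frame (padding policy assumed)."""
    h, w = img.shape[:2]
    half = size // 2
    x, y = center
    # Start from a mean-filled canvas, then paste the in-frame part.
    out = np.full((size, size) + img.shape[2:], img.mean(), dtype=img.dtype)
    x0, y0 = max(0, x - half), max(0, y - half)
    x1, y1 = min(w, x - half + size), min(h, y - half + size)
    ox, oy = x0 - (x - half), y0 - (y - half)   # offsets inside the canvas
    out[oy:oy + (y1 - y0), ox:ox + (x1 - x0)] = img[y0:y1, x0:x1]
    return out
```

In this scheme the template would be `center_crop(frame0, target_center, 127)` and the search region `center_crop(frame_i1, prev_center, 287)`.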
S2, feeding the template and the search region into the deep feature-extraction network to obtain the template feature Z and the search-region feature X, of sizes Hz × Wz × C and Hx × Wx × C respectively (width × length × number of channels); in this embodiment the sizes are 13 × 13 × 256 and 25 × 25 × 256.
S3, feeding the template feature and the search-region feature into two space-channel cross-correlation modules (SC3M) to obtain a feature map R1 suited to the classification sub-network and a feature map R2 suited to the regression sub-network, as shown in FIG. 3, comprising the following steps:
S31, reshaping the template feature Z along the spatial dimension into a spatial kernel K1 comprising Hz × Wz small kernels, each of size 1 × 1 × C; reshaping the template feature Z along the channel dimension into a channel kernel K2 comprising C small kernels, each of size 1 × 1 × HzWz;
S32, performing a pixel-wise cross-correlation between the search-region feature X and the spatial kernel K1 to obtain feature map F1 of size Hx × Wx × HzWz: F1 = X ⋆ K1, where ⋆ denotes pixel-wise cross-correlation; this step fully fuses the search-region feature X and the template feature Z in the spatial dimension;
S33, performing a pixel-wise cross-correlation between feature map F1 and the channel kernel K2 to obtain feature map F2 of size Hz × Wz × C: F2 = F1 ⋆ K2; this step fully fuses the search-region feature and the template feature in the channel dimension;
S34, concatenating feature map F1 and feature map F2, reducing the dimensionality with a 1 × 1 convolution, and feeding the result into a plug-and-play SE-block module;
through optimization, the 1 × 1 convolution and the SE-block in this step yield feature maps suited to different task biases; the SE-block does not change the size of the input features, and serves to capture global information and the dependencies between channels;
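The SE-block (squeeze-and-excitation) mentioned here can be sketched as follows. The bottleneck weights W1 and W2 are assumed to come from training, and the reduction ratio is illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(F, W1, W2):
    """Squeeze-and-Excitation sketch: global average pool per channel,
    a two-layer bottleneck (weights W1: (C/r, C) and W2: (C, C/r)),
    then a sigmoid gate that rescales each channel.  The output keeps
    the input size, matching the text above."""
    s = F.mean(axis=(0, 1))          # squeeze: (C,) channel descriptor
    z = np.maximum(W1 @ s, 0.0)      # excitation bottleneck with ReLU
    g = sigmoid(W2 @ z)              # per-channel gate in (0, 1)
    return F * g                     # reweight channels, shape unchanged
```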
S35, performing steps S31-S34 twice; because the two passes are optimized with different emphases during training, they yield, respectively, a feature map R1 suited to the classification sub-network and a feature map R2 suited to the regression sub-network, both of size Hz × Wz × C, the same as the template feature Z;
owing to the lightweight nature of the feature fusion in steps S31-S34, the operation can be performed twice; in SiamCAR, by contrast, the feature fusion is computationally expensive, so it is performed only once and the single resulting feature map is used for all subtasks.
S4, sending the feature graph R1 into a classification sub-network to obtain a classification graph, and sending the feature graph R2 into a regression sub-network to obtain a centrality graph and a regression graph;
S41, feeding feature map R1 into the classification sub-network, a CNN with a single classification branch, which outputs a classification map; each point of the classification map predicts the probability that the location is foreground or background.
S42, sending the feature graph R2 into a regression sub-network, wherein the regression sub-network is a CNN network and comprises a centrality branch and a boundary box regression branch, and respectively outputting the centrality graph
Figure BDA0003954953370000093
And regression graph
Figure BDA0003954953370000094
Each point of the center map predicts the likelihood that the location is the center of the target, and each point of the regression map predicts the distance of the point from the bounding box, up, down, left, right, as shown in fig. 4.
The classification branch, the centrality branch and the bounding-box regression branch each consist of three fully convolutional layers and differ only in the number of output channels.
S5, optimizing steps S3-S4 through the loss functions, and obtaining the predicted target bounding box from the optimized classification map, centrality map and regression map;
S51, training the centrality branch with the centrality loss, the bounding-box regression branch with the centrality-weighted regression loss, and the classification branch with the classification loss; the classification loss uses the CE (Cross Entropy) loss function

L_cls = −(1/N) Σ_{i,j} [ y*_{i,j} log p_{i,j} + (1 − y*_{i,j}) log(1 − p_{i,j}) ],

the centrality loss uses the BCE (Binary Cross Entropy) loss function

L_cen = −(1/N_pos) Σ_{(i,j)∈P} [ c*_{i,j} log c_{i,j} + (1 − c*_{i,j}) log(1 − c_{i,j}) ],

and the centrality-weighted regression loss is

L_reg = (1/N_pos) Σ_{(i,j)∈P} c*_{i,j} (1 − IoU(B_{i,j}, B*_{i,j})),
where i, j denote the coordinate position on the corresponding map, y*_{i,j} is the ground-truth classification label, c*_{i,j} the ground-truth centrality, p_{i,j} and c_{i,j} the network predictions, N the total number of samples, N_pos the number of positive samples, P the set of positive samples, IoU the intersection-over-union of the two boxes in parentheses, and B_{i,j} and B*_{i,j} the predicted and ground-truth bounding boxes respectively. Weighting the ground-truth centrality c*_{i,j} onto the IoU regression loss suppresses the contribution of low-quality predicted target bounding boxes far from the center point.
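The centrality ground truth c*_{i,j} is not defined explicitly in the text; SiamCAR (and FCOS, from which it derives) compute it from the ground-truth regression distances (l, t, r, b), which is assumed here:

```python
import math

def centrality_target(l, t, r, b):
    """FCOS-style centrality target: 1 at the box center, decaying toward 0
    at the box edges.  (l, t, r, b) are the ground-truth distances from a
    position to the left, top, right and bottom sides of the box."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
```

Because this value shrinks for positions near the box border, using it as the weight on the IoU loss automatically down-weights off-center, low-quality predictions.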
S52, obtaining a predicted target boundary box according to the optimized classification diagram, the optimized centrality diagram and the optimized regression diagram:
each point of the classification graph predicts the possibility that the point position is the foreground and the background, each point of the central graph predicts the possibility that the point position is the target center, and each point of the regression graph predicts the distance between the point and the upper, lower, left and right sides of the boundary box; therefore, the foreground part in the classification map is multiplied by the centrality map to obtain a point with the maximum response, namely a predicted target central point; and then obtaining the distances from the points to four edges of the predicted boundary frame according to the regression graph, and obtaining the predicted target boundary frame by combining the predicted target central point.
To verify the technical effect of this embodiment, the method was evaluated on several authoritative datasets, including VOT2018, OTB100, UAV123 and GOT-10k. FIG. 5 compares the method of the present invention with other methods on OTB100, where (a) is the precision plot and (b) the success plot on the OTB2015 dataset; FIG. 6 compares the methods on the UAV123 dataset, where (a) is the precision plot and (b) the success plot. In a precision plot, the abscissa is a distance threshold and the ordinate is the percentage of video frames in which the distance between the center of the bounding box estimated by the tracker and the center of the manually annotated ground truth is below that threshold; in a success plot, the abscissa is an overlap threshold and the ordinate is the percentage of frames whose overlap score (OS) exceeds it. The method of the invention performs well on both the precision and the success plots. Table 1 compares performance on the VOT2018 dataset, where a larger accuracy value indicates higher accuracy, a larger robustness value indicates worse stability, and EAO (Expected Average Overlap) denotes the no-reset overlap expectation, larger being better; the values in the table show that the accuracy of the method is second only to SiamRPN++, its EAO is the best, and its robustness is also good, so the overall performance is excellent. Table 2 compares performance on the GOT-10k dataset, where AO denotes the average overlap between all estimated bounding boxes and the ground-truth boxes, SR0.5 the rate of successfully tracked frames with overlap above 0.5, SR0.75 the rate of frames with overlap above 0.75, and FPS the number of frames processed per second; the AO of the method is second only to SiamGAT++, its SR0.5 second only to SiamGAT++ and RBO, its SR0.75 the best, and its FPS high, indicating good overall tracking performance.
TABLE 1 — performance comparison on the VOT2018 dataset (presented as an image in the original)

TABLE 2 — performance comparison on the GOT-10k dataset (presented as an image in the original)
The embodiment of the invention also provides a twin target tracking system based on space-channel cross correlation and centrality guidance, which can realize the twin target tracking method based on space-channel cross correlation and centrality guidance, and comprises a processor and a storage medium, wherein the storage medium is used for storing instructions; the processor executes the twin target tracking method based on the space-channel cross correlation and the centrality guidance, and comprises the following functional modules: a data acquisition module for executing the step S1; a feature extraction module for executing the step S2; a space-channel cross-correlation module for performing said step S3; a classification regression module for executing the step S4; a prediction module, configured to perform the step S5.
The twin target tracking method described above may be converted into software program instructions and implemented either by a twin target tracking system comprising a processor and a memory, or by computer instructions stored in a non-transitory computer-readable storage medium. An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium: the software functional unit includes several instructions that enable a computer device (which may be a personal computer, a server or a network device) or a processor to execute some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
In summary, the twin target tracking method based on space-channel cross correlation and centrality guidance provides the beneficial effects set out in section (III) above.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A twin target tracking method based on space-channel cross correlation and centrality guidance is characterized by comprising the following steps of:
s1, obtaining a template and a search area in an image;
S2, feeding the template and the search region into a deep feature-extraction network to obtain the template feature (denoted Z) and the search-region feature (denoted X), whose width × length × number of channels are Hz × Wz × C and Hx × Wx × C respectively;
s3, respectively sending the template features and the search region features into two space-channel cross-correlation modules to obtain a feature map R1 suitable for a classification subnetwork and a feature map R2 suitable for a regression subnetwork;
s4, sending the feature graph R1 into a classification sub-network to obtain a classification graph, and sending the feature graph R2 into a regression sub-network to obtain a centrality graph and a regression graph;
s5, optimizing the steps S3-S4 through a loss function, and obtaining a predicted target boundary frame according to the classification chart, the centrality chart and the regression chart after optimization;
the step S3 includes:
s31, characterizing the template
Figure FDA0003954953360000013
The method comprises the following steps of (1) dividing a space dimension into space kernels K1 comprising Hz multiplied by Wz small kernels, wherein the size of each small kernel is 1 multiplied by C; characterizing the template
Figure FDA0003954953360000014
Dividing the channel dimension into channel cores K2 comprising C small cores, wherein the size of each small core is 1 multiplied by HzWz;
s32, the search area characteristics are set
Figure FDA0003954953360000015
And performing pixel-by-pixel cross-correlation operation with the space kernel K1 to obtain a characteristic diagram F1:
Figure FDA0003954953360000016
:, pixel-by-pixel cross-correlation;
s33, performing pixel-by-pixel cross-correlation operation on the feature map F1 and the channel kernel K2 to obtain a feature map F2: f2= F1 ═ K2;
s34, splicing the characteristic diagram F1 and the characteristic diagram F2, performing 1 x 1 convolution dimensionality reduction, and sending the obtained object into a hot plug module SE-block;
s35, repeating the steps S31-S34 twice to respectively obtain a feature map R1 suitable for the classification sub-network and a feature map R2 suitable for the regression sub-network, wherein the feature maps are Hz multiplied by Wz multiplied by C.
2. The twin target tracking method based on space-channel cross-correlation and centrality guidance according to claim 1, wherein in step S1 the template is an image of a specified pixel size cropped, centered on the target, from the first frame of a dataset sequence or of a camera capture; the search area is an image of a specified size cropped from the (i+1)-th frame during tracking, centered on the target position of the i-th frame.
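The cropping of claim 2 can be sketched as a center crop whose window is clamped to the image border; the clamping strategy is an assumption (implementations often pad with the channel mean instead):

```python
import numpy as np

def center_crop(image, center_xy, size):
    """Crop a size x size window centered at center_xy, clamping to the image border."""
    H, W = image.shape[:2]
    cx, cy = center_xy
    x1 = int(np.clip(cx - size // 2, 0, max(W - size, 0)))
    y1 = int(np.clip(cy - size // 2, 0, max(H - size, 0)))
    return image[y1:y1 + size, x1:x1 + size]
```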
3. The twin target tracking method based on space-channel cross-correlation and centrality guidance according to claim 1, wherein the step S4 comprises:
S41, sending the feature map R1 into the classification sub-network, a CNN with only one classification branch, and outputting a classification map A_cls, each point of which predicts the likelihood that the corresponding location is foreground or background;
S42, sending the feature map R2 into the regression sub-network, a CNN comprising a centrality branch and a bounding-box regression branch, and outputting a centrality map A_cen and a regression map A_reg respectively; each point of the centrality map predicts the probability that the location is the target center, and each point of the regression map predicts the distances from that point to the top, bottom, left and right sides of the bounding box.
4. The twin target tracking method based on space-channel cross-correlation and centrality guidance according to claim 1, wherein in step S5 the optimization through loss functions comprises: the centrality branch is trained with a centrality loss, the bounding-box regression branch is trained with a centrality-weighted regression loss, and the classification branch is trained with a classification loss.
5. The twin target tracking method based on space-channel cross-correlation and centrality guidance according to claim 4, wherein the classification loss is:
L_cls = -(1/N) Σ_{i,j} [ c*_{i,j} · log p_{i,j} + (1 - c*_{i,j}) · log(1 - p_{i,j}) ]
the centrality loss is:
L_cen = -(1/N_pos) Σ_{(i,j)∈Ω_pos} [ cen*_{i,j} · log cen_{i,j} + (1 - cen*_{i,j}) · log(1 - cen_{i,j}) ]
and the centrality-weighted regression loss is:
L_reg = (1 / Σ_{(i,j)∈Ω_pos} cen*_{i,j}) · Σ_{(i,j)∈Ω_pos} cen*_{i,j} · L_IoU(B_{i,j}, B*_{i,j})
wherein (i, j) denotes the coordinate position in the corresponding map, c*_{i,j} denotes the true value of the classification label, cen*_{i,j} denotes the true value of the centrality, p_{i,j} and cen_{i,j} denote the corresponding network predictions, N denotes the total number of points, N_pos denotes the number of positive samples, Ω_pos denotes the set of positive samples, L_IoU denotes the IoU loss, IoU denotes the intersection ratio of the two arguments in parentheses, and B_{i,j} and B*_{i,j} denote the predicted bounding box and the real bounding box respectively.
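A minimal sketch of the centrality-weighted regression loss follows, assuming axis-aligned boxes given as (x1, y1, x2, y2) and taking L_IoU = 1 - IoU; the exact IoU-loss variant used by the patent is not shown in its figures, so this choice is an assumption:

```python
import numpy as np

def iou(a, b):
    """IoU of axis-aligned boxes, each array of shape (N, 4) as (x1, y1, x2, y2)."""
    ix1 = np.maximum(a[:, 0], b[:, 0]); iy1 = np.maximum(a[:, 1], b[:, 1])
    ix2 = np.minimum(a[:, 2], b[:, 2]); iy2 = np.minimum(a[:, 3], b[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area = lambda x: (x[:, 2] - x[:, 0]) * (x[:, 3] - x[:, 1])
    return inter / np.clip(area(a) + area(b) - inter, 1e-8, None)

def centrality_weighted_iou_loss(pred, gt, cen_target):
    """Weight the per-location IoU loss by the centrality target, which
    suppresses low-quality predicted boxes far from the center point."""
    w = cen_target
    return np.sum(w * (1.0 - iou(pred, gt))) / np.clip(np.sum(w), 1e-8, None)
```

Locations with a centrality target near zero (far from the center) contribute almost nothing to the loss, which is the suppression effect described in the advantages above.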
6. The twin target tracking method based on space-channel cross-correlation and centrality guidance according to claim 1, wherein in step S5 obtaining the predicted target bounding box from the optimized classification map, centrality map and regression map comprises:
multiplying the foreground part of the classification map by the centrality map and taking the point with the maximum response as the predicted target center point; then obtaining from the regression map the distances from that point to the four sides of the predicted bounding box, and combining them with the predicted target center point to obtain the predicted target bounding box.
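The box decoding of claim 6 can be sketched as follows; the mapping from map coordinates back to image coordinates (here a plain stride multiplication) is an assumption:

```python
import numpy as np

def decode_box(cls_fg, cen, reg, stride=8):
    """cls_fg, cen: (H, W) foreground-classification and centrality maps;
    reg: (H, W, 4) distances (l, t, r, b) from each point to the box sides.
    Returns the predicted box (x1, y1, x2, y2) in image coordinates."""
    score = cls_fg * cen                     # multiply foreground part by centrality
    i, j = np.unravel_index(np.argmax(score), score.shape)
    l, t, r, b = reg[i, j]
    cx, cy = j * stride, i * stride          # assumed map -> image coordinate mapping
    return (cx - l, cy - t, cx + r, cy + b)
```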
7. The twin target tracking method based on space-channel cross-correlation and centrality guidance of claim 2, wherein the specified size of the template crop is 127 x 127 pixels and the specified size of the search-region crop is 287 x 287 pixels.
8. The twin target tracking method based on space-channel cross-correlation and centrality guidance according to claim 1, wherein the width x length x number of channels size of the template feature is 13 x 256, and the width x length x number of channels size of the search area feature is 25 x 256.
9. A twin target tracking system based on space-channel cross-correlation and centrality guidance, comprising at least one processor; and at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, and the processor invokes the program instructions to perform the twin target tracking method based on space-channel cross-correlation and centrality guidance according to any one of claims 1 to 8, comprising the following functional modules:
a data acquisition module for executing the step S1;
a feature extraction module for executing the step S2;
a space-channel cross-correlation module for performing said step S3;
a classification regression module for executing the step S4;
a prediction module, configured to perform the step S5.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the twin target tracking method based on space-channel cross-correlation and centrality guidance according to any one of claims 1 to 8.
CN202211459889.0A 2022-11-17 2022-11-17 Twin target tracking method based on space-channel cross correlation and centrality guidance Pending CN115719367A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211459889.0A CN115719367A (en) 2022-11-17 2022-11-17 Twin target tracking method based on space-channel cross correlation and centrality guidance

Publications (1)

Publication Number Publication Date
CN115719367A true CN115719367A (en) 2023-02-28

Family

ID=85255866


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination