CN111508002A - Small-sized low-flying target visual detection tracking system and method thereof - Google Patents
Small-sized low-flying target visual detection tracking system and method thereof
- Publication number
- CN111508002A (application CN202010309617.7A)
- Authority
- CN
- China
- Prior art keywords
- target
- frame
- tracking
- unit
- detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a small low-flying target visual detection and tracking system and method. The system comprises: a video data input unit, a video preprocessing unit, a training data construction unit, a detection model training unit, a target comparison screening unit, a detection correction unit, a reference frame initialization unit, a sample library dynamic construction unit, an online learning unit, a position refinement unit, a decision control unit and a tracking result output unit. The method comprises the following steps: constructing a target detection network, and comparing and screening targets; online learning for target tracking; dynamically constructing a classifier training sample library, and refining the target tracking position. The advantages of the invention are: tracking drift caused by occlusion, scale change, illumination and similar factors is effectively alleviated, enabling robust target tracking; the reference frame features are updated promptly as the target changes, while the introduced feature point matching algorithm avoids erroneous tracking caused by those updates.
Description
Technical Field
The invention relates to the technical field of flying target tracking, in particular to a small low-flying target detection and tracking system based on a neural network and online learning, and its detection and tracking method.
Background
At present, there are related methods for joint detection and tracking, as well as tracking methods suited to low-speed small targets. Existing methods realize short-term target tracking through correlation filtering and, when tracking fails, relocate the target using a neural-network-based target detection method. Related patents and research are as follows:
The Chinese invention patent with application number CN201910306616.4 discloses a robust long-term tracking method based on correlation filtering and target detection. The method tracks the target by correlation filtering and detects it with the one-stage target detector YOLO; after a detection result is obtained, a SURF feature point matching method selects the candidate box with the highest number of matched points as the target bounding box for reinitializing the tracker, finally achieving long-term tracking.
The Chinese invention patent titled "A low-altitude slow unmanned aerial vehicle tracking method combining correlation filtering and visual saliency", application number 201910117155.6, determines the predicted target's center position from the prediction response map obtained by correlation filtering in a small search area, and determines the predicted target's scale by saliency detection in a large search area, realizing a tracking method suited to low-altitude slow unmanned aerial vehicles. However, the method performs no further processing after target tracking fails, and its precision remains to be improved.
Taken together, the prior art neither resolves the extreme target-background imbalance in single-target detection and tracking nor achieves optimal network performance, and the tracking precision for small targets awaits further improvement.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention provides a small low-flying target visual detection and tracking system and method that remedy them.
To realize this purpose, the invention adopts the following technical scheme:
A small low-flying target detection and tracking system, comprising: a video data input unit, a video preprocessing unit, a training data construction unit, a detection model training unit, a target comparison screening unit, a detection correction unit, a reference frame initialization unit, a sample library dynamic construction unit, an online learning unit, a position refinement unit, a decision control unit and a tracking result output unit.
The video data input unit is configured to: a plurality of video sequences containing targets are input and randomly divided into two parts, wherein one part is used for training a target detection model, and the other part is used for online testing of a target tracking model.
The video preprocessing unit is configured to: complete the early-stage video preprocessing according to the needs of the target detection and tracking units, specifically deleting video segments that contain no target for a long time and eliminating video segments that clearly do not match the characteristics of small targets flying slowly in low-altitude airspace.
The training data construction unit is configured to: the completeness and richness of training data are guaranteed, a training set and a verification set are constructed in a mode of extracting video frames at equal intervals, and data labeling is carried out, namely the information of the center position, the width and the height of a target in an image is determined and is used for a supervised training target detection model.
The detection model training unit is used for: creating a pyramid-structured target detection network and using focal loss to alleviate the target-background imbalance problem. Training is stopped once the training loss function is observed to stabilize, and the best-performing model file from the verification process is saved to provide reset-box information when target tracking fails.
The target comparison screening unit is used for: comparing the first-frame ground-truth box of the tracked video with the detection result using the SURF (speeded-up robust features) feature matching algorithm and eliminating false alarms that are clearly not low-slow small targets, further ensuring the robustness of long-term stable tracking.
The detection correction unit is used for: starting when either of the following two conditions occurs: first, when the position confidence of the tracking box falls below a set threshold, indicating that target tracking of the current frame has failed; second, automatically when a specified frame interval is reached, ensuring that the currently tracked target does not differ too much from the target features of the reference frame.
The reference frame initialization unit is configured to: cut out a search area at 5 times the target size according to the received target position and scale information of the reference frame, scale the image to the specified size of 288 × 288, and input the fixed-size search-area image patch together with the target position and scale information into the dynamically constructed classifier sample library.
The sample library dynamic construction unit is used for: performing data enhancement on the reference frame sample, including the basic operations of rotation, scaling, jittering and blurring, and receiving samples newly added during tracking.
The online learning unit extracts features from samples stored in the sample library using a ResNet18 deep network and obtains a predicted Gaussian response map after two fully connected layers. From the label information in the sample library, a ground-truth Gaussian label is generated with the target center as the peak of the Gaussian distribution. The parameters of the two fully connected layers are adjusted online with the optimization target of reducing the difference between the predicted response map and the ground-truth label, so that, without labels, the feature extraction network and fully connected layers can produce the predicted Gaussian response map and hence the predicted target position center.
The position refinement unit is configured to: refine the initial tracking result from the online learning unit. Several jittered boxes are generated around the predicted target center of the current frame (from the online learning unit) with the target width and height of the previous frame, and mapped onto the current frame's search area. Features are extracted from the jittered boxes with the precise region-of-interest pooling layer and concatenated with the modulation features of the reference frame; a fully connected layer then yields the predicted-position confidence of each jittered box, and the 3 boxes with the highest confidence are merged to obtain the refined tracking result of the current frame.
The decision control unit is configured to: judge the target tracking state during tracking from the relation between the predicted-position confidence of the tracking box and a set threshold; if the target is tracked stably, tracking continues with the next frame; if the target is lost, the detection correction unit is activated to detect the target in the current frame and update the reference frame, realizing long-term stable tracking of the low-slow small target.
The tracking result output unit is used for: outputting the position and scale information of each frame after traversing all frames of the video.
The invention also discloses a small low-flying target visual detection tracking method, which comprises the following steps:
step 1, constructing a target detection network;
1) constructing a network structure comprising: a backbone network, a classification subnet and a regression subnet;
2) A loss function: focal loss addresses the severe imbalance between the proportions of positive and negative samples in target detection and reduces the weight of simple negative samples in training.
Step 2, comparing and screening targets;
Before the video is sent to the detection correction unit, SURF feature point matching is performed once: the first-frame target of the current video is feature-matched against the detection result. When the number of matched points exceeds a set value, the detection result is confirmed as the target currently to be tracked and detection succeeds; the detection box is then sent to the detection correction unit for the subsequent process.
Step 3, target tracking online learning;
The target center position is predicted, comprising two parts, classifier initialization and the online classification process:
1) initialization classifier
For the data-enhanced reference frame in the sample library, features are extracted with the feature extraction network, and a two-dimensional Gaussian ground-truth label $y_{gt}$ of the same size as the feature map is generated with the target center position of the reference frame as its peak. The classifier is initialized from the features and the label; a least-squares optimization algorithm minimizes the distance between the actual value and the ground truth, and the nonlinear least-squares problem is solved by the Gauss-Newton iteration method. The formula is as follows:

$$y_{gt}(x, y) = \exp\left(-\frac{(x - x_0)^2 + (y - y_0)^2}{2\sigma^2}\right) \qquad (1)$$

In formula (1), $x \in \{1, \dots, M\}$ is the horizontal coordinate on the feature map and M the width of the feature map, $y \in \{1, \dots, N\}$ is the vertical coordinate and N the height of the feature map, $(x_0, y_0)$ is the center point of the reference-frame target, and $\sigma$ is the Gaussian bandwidth.
2) Online classification process
Based on the tracking result $(x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1})$ of the previous frame (frame t-1), where $(x_{t-1}, y_{t-1})$ is the estimated target center and $(w_{t-1}, h_{t-1})$ the estimated target width and height, the target center of the previous frame is taken as the center of the current frame's (frame t) target search area, the width and height are expanded by a specified ratio k, and the current frame's search area $(x_{t-1}, y_{t-1}, k \cdot w_{t-1}, k \cdot h_{t-1})$ is generated. The feature extraction network then extracts the search-area feature $f_t$, and after two fully connected layers a predicted Gaussian response map of the same size as the search area is generated:

$$\hat{y}_t = \varphi_2(\varphi_1(f_t; \mathrm{weight}_1); \mathrm{weight}_2)$$

where $\varphi_1, \varphi_2$ denote the mapping functions of the fully connected layers and $\mathrm{weight}_1, \mathrm{weight}_2$ their weight coefficient matrices. The maximum response position is the target center coordinate $(x_t, y_t)$ estimated for the current frame. The online-trained classifier fully considers both the tracked target and the background area, and is continuously updated to estimate the target position.
Step 4, dynamically constructing a classifier training sample library
1) The reference frame update interval is set to T. When the current frame number t is divisible by T, the target detection unit is called to update the reference frame, all outdated samples in the sample library are cleared, and the classifier is reinitialized with the updated reference frame. Meanwhile, newly generated samples are sequentially added to the sample library as tracking proceeds, so that the features of targets in the sample library remain highly similar to the currently tracked sample and the target center position can be accurately estimated.
2) When the predicted-position confidence of the tracking box is smaller than the set threshold, target tracking of the current frame has failed. The decision control unit sends the detection correction unit the information that the reference frame needs to be reinitialized; after the detection correction unit receives the detection-box information from the target detection unit, the information is sent to the reference frame initialization unit, where data enhancement and related operations are performed, and the result is finally sent to the dynamically constructed sample library.
Step 5, fine trimming of the target tracking position;
This step comprises two parts: a feature extraction network and a similarity evaluation network:
1) feature extraction network
Feature extraction uses a ResNet18 network. Retaining previous template information is balanced against updating with current reference-frame information, providing the neural network with features that combine the target's current and historical states and improving tracking stability. Search-area features are extracted for the reference frame, the current frame, and the image frame at the intermediate time between them, and each is fed into the precise region-of-interest pooling layer for the similarity evaluation network to compute the predicted-position confidence.
2) Similarity evaluation network
The core of the similarity evaluation network is the precise region-of-interest pooling layer, whose input comprises two parts. The first part is the image feature map extracted by the network, to which bilinear interpolation with interpolation coefficient IC is applied:

$$IC(x, y, i, j) = \max(0, 1 - |x - i|) \times \max(0, 1 - |y - j|) \qquad (2)$$

mapping the discrete feature map into a continuous space and yielding the feature map $f(x, y)$:

$$f(x, y) = \sum_{i, j} IC(x, y, i, j) \times w_{i, j} \qquad (3)$$

In formulas (2) and (3), $(x, y)$ is a continuous coordinate on the feature map, $(i, j)$ is a coordinate index on the feature map, and $w_{i,j}$ is the weight at position $(i, j)$ on the feature map. The second part of the input is the top-left corner coordinates $(x_1, y_1)$ and bottom-right corner coordinates $(x_2, y_2)$ of the rectangular box. The precise region-of-interest pooling operation is performed on the obtained continuous spatial feature map and the rectangular-box coordinates, preserving the target features on the image to the greatest extent in preparation for further comparing the similarity of the reference target and historical-frame targets. Finally, the feature map $f(x, y)$ is doubly integrated and divided by the area of the rectangular box, giving precise region-of-interest pooling (PrROI Pooling):

$$\mathrm{PrPool}(f, \mathit{RoI}) = \frac{\int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y)\, dx\, dy}{(x_2 - x_1) \times (y_2 - y_1)} \qquad (4)$$
After the features from the precise region-of-interest pooling layer are obtained, the three features of the reference frame, the intermediate frame and the current frame are concatenated and input into the fully connected layer, which outputs the final position confidence. By comparing the similarity of candidate targets with the historical target, the most similar target is found and taken as the tracking result.
Compared with the prior art, the invention has the advantages that:
1) By combining detection and tracking methods, the invention effectively alleviates tracking drift of the tracked target caused by factors such as occlusion, scale change and illumination, and realizes robust target tracking.
2) The method can update the reference frame features promptly as the target changes, while the introduced feature point matching algorithm avoids erroneous tracking caused by updating those features.
3) The method is suitable for long-term stable tracking of low-speed small targets in optical aerial remote sensing images.
Drawings
FIG. 1 is a block diagram of a small low-flying target detection and tracking system according to an embodiment of the present invention;
FIG. 2 is a diagram of a target detection network architecture according to an embodiment of the present invention;
FIG. 3 is a flow chart of online learning according to an embodiment of the present invention;
fig. 4 is a flowchart of position refinement according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings by way of examples.
As shown in fig. 1, a small low-flying target visual detection tracking system includes the following units:
(1) a video data input unit. A plurality of video sequences containing targets are input and randomly divided into two parts, wherein one part is used for training a target detection model, and the other part is used for online testing of a target tracking model.
(2) Video preprocessing unit. Completes the early-stage video preprocessing according to the needs of the target detection and tracking units, specifically deleting video segments that contain no target for a long time and removing video segments that clearly do not match the characteristics of low-slow small targets.
(3) Training data construction unit. To ensure the completeness and richness of the training data, a training set and a verification set are constructed by extracting video frames at equal intervals, and data labeling is performed, namely determining the center position, width and height of the target in each image; this information is used for supervised training of the target detection model.
(4) Detection model training unit. Because low-slow small flying targets vary greatly in scale and classes are extremely imbalanced during single-target training, a target detection network with a pyramid structure is designed, and focal loss is used to alleviate the target-background imbalance. Training stops once the training loss function is observed to stabilize, and the best-performing model file from the verification process is saved to provide reset-box information when target tracking fails.
(5) And a target comparison screening unit. In view of the possibility of false alarms in the target detection result, the SURF feature matching algorithm is used for comparing the first frame true value frame of the tracked video with the detection result, so that false alarms which are obviously not low-slow small targets are eliminated, and the robustness of long-term stable tracking is further ensured.
(6) Detection correction unit. It is started when either of the following two conditions occurs: first, when the position confidence of the tracking box falls below a set threshold, indicating that target tracking of the current frame has failed; second, automatically when a specified frame interval is reached, ensuring that the currently tracked target does not differ too much from the target features of the reference frame.
(7) Reference frame initialization unit. According to the received target position and scale information of the reference frame, a search area is cut out at 5 times the target size, the image is scaled to the specified size of 288 × 288, and the fixed-size search-area image patch together with the target position and scale information is input into the dynamically constructed classifier sample library.
(8) Sample library dynamic construction unit. Performs data enhancement on the reference frame sample, including the basic operations of rotation, scaling, jittering and blurring, and receives samples newly added during tracking.
(9) Online learning unit. Extracts features from samples stored in the sample library using a ResNet18 deep network and obtains a predicted Gaussian response map after two fully connected layers. From the label information in the sample library, a ground-truth Gaussian label is generated with the target center as the peak of the Gaussian distribution, and the parameters of the two fully connected layers are adjusted online with the optimization target of reducing the difference between the predicted response map and the ground-truth label, so that, without labels, the feature extraction network and fully connected layers can produce the predicted Gaussian response map and hence the predicted target center.
(10) Position refinement unit. Refines the initial tracking result from the online learning unit. Several jittered boxes are generated around the predicted target center of the current frame (from the online learning unit) with the target width and height of the previous frame, and mapped onto the current frame's search area. Features are extracted from the jittered boxes with the precise region-of-interest pooling layer and concatenated with the modulation features of the reference frame; a fully connected layer then yields the predicted-position confidence of each jittered box, and the 3 boxes with the highest confidence are merged to obtain the refined tracking result of the current frame.
(11) Decision control unit. During tracking, the target tracking state is judged from the relation between the predicted-position confidence of the tracking box and a set threshold: if the target is tracked stably, tracking continues with the next frame; if the target is lost, the detection correction unit is activated to detect the target in the current frame and update the reference frame, realizing long-term stable tracking of the low-slow small target.
(12) Tracking result output unit. After traversing all frames of the video, the position and scale information of each frame is output.
A small-sized low-flying target visual detection tracking method comprises the following steps:
step 1, constructing a target detection network;
The detection network for low-slow small targets mainly comprises two parts: a neural network with a multi-stage pyramid structure, and a loss function that relieves the extreme target-background imbalance during training:
1) network architecture
Backbone network:
The backbone network for target detection uses feature pyramid levels P3 through P7. In the backbone, P3 and P4 are each composed of two parts: one part is computed by lateral connection from the outputs (C3 and C4) of the corresponding feature extraction network ResNet; the other upsamples the deep, small-sized feature map top-down to the same size as the shallow one, then applies addition and convolution so that the number of output channels remains 256. Although the channel count of the feature map is kept unchanged throughout, the information is enriched. P5 is computed by lateral connection from the output (C5) of the corresponding feature extraction network ResNet. P6 and P7 are obtained by applying a convolution layer and an activation layer on top of C5. The resolution of $P_l$ is $2^l$ times lower than that of the input image (l denotes the pyramid level), and all pyramid levels have C = 256 channels.
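For concreteness, the sketch below shows one plausible way to assemble P3-P7 from ResNet outputs C3-C5 along these lines. It is a minimal PyTorch illustration; the channel counts, layer names and nearest-neighbor upsampling are assumptions, not the patent's reference implementation.

```python
import torch.nn as nn
import torch.nn.functional as F

class FeaturePyramid(nn.Module):
    """Minimal sketch of the P3-P7 pyramid described above (channel counts assumed)."""
    def __init__(self, c3_ch=128, c4_ch=256, c5_ch=512, out_ch=256):
        super().__init__()
        # Lateral 1x1 convolutions bring C3-C5 to a common channel count.
        self.lat3 = nn.Conv2d(c3_ch, out_ch, 1)
        self.lat4 = nn.Conv2d(c4_ch, out_ch, 1)
        self.lat5 = nn.Conv2d(c5_ch, out_ch, 1)
        # 3x3 convolutions smooth the merged maps.
        self.smooth3 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.smooth4 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        # P6/P7 come from C5 via stride-2 convolutions (with a ReLU before P7).
        self.p6 = nn.Conv2d(c5_ch, out_ch, 3, stride=2, padding=1)
        self.p7 = nn.Conv2d(out_ch, out_ch, 3, stride=2, padding=1)

    def forward(self, c3, c4, c5):
        p5 = self.lat5(c5)
        # Top-down path: upsample the deeper map and add the lateral output.
        p4 = self.smooth4(self.lat4(c4) + F.interpolate(p5, size=c4.shape[-2:]))
        p3 = self.smooth3(self.lat3(c3) + F.interpolate(p4, size=c3.shape[-2:]))
        p6 = self.p6(c5)
        p7 = self.p7(F.relu(p6))
        return p3, p4, p5, p6, p7
```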
And (3) classifying the subnets:
From a given pyramid level's C-channel input feature map, the subnet applies four 3 × 3 convolutional layers, each with C = 256 filters and a ReLU activation function, followed by a 3 × 3 convolutional layer with K·A filters; finally, a sigmoid activation is attached at each spatial location, outputting K·A binary predictions per location.
Regression subnet:
The bounding-box regression subnet runs in parallel with the target classification subnet: another small fully convolutional network is attached after each pyramid level to regress the offset from each anchor box to a nearby ground-truth target, if one exists. The design of the box regression subnet is identical to the classification subnet except that it has 4·A linear outputs at each spatial position: for the A anchor boxes centered at each spatial position, the 4 outputs predict the relative offsets between the top-left and bottom-right corner coordinates of the anchor box and the corresponding positions of the ground-truth box. The target classification subnet and the box regression subnet share a common structure but use separate parameters. Fig. 2 shows the main structure of the target detection network.
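A compact sketch of the two heads follows, assuming K = 1 target class and A = 9 anchors per location (illustrative values not fixed by the text):

```python
import torch.nn as nn

def make_head(out_channels, in_ch=256, n_convs=4):
    """Four 3x3 conv + ReLU layers followed by a 3x3 prediction layer,
    mirroring the subnet structure described above."""
    layers = []
    for _ in range(n_convs):
        layers += [nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True)]
    layers.append(nn.Conv2d(in_ch, out_channels, 3, padding=1))
    return nn.Sequential(*layers)

K, A = 1, 9                    # assumed: one target class, nine anchors per location
cls_subnet = make_head(K * A)  # K*A binary predictions (sigmoid applied in the loss)
reg_subnet = make_head(4 * A)  # four corner offsets per anchor
```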
2) Loss function
The focal loss (Focal Loss, FL)

$$FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t) \qquad (5)$$

is used to address the severe imbalance between the proportions of positive and negative samples in target detection and to reduce the weight of simple negative samples during training. In formula (5), $FL(p_t)$ denotes the focal loss, $p_t$ is the probability discrimination function value representing the probability that the target is classified correctly, $\alpha_t$ is a balance factor countering the uneven proportion of positive and negative samples, and $\gamma$ is a balance factor for hard and easy samples: with $\gamma > 0$, the loss of easily classified samples is reduced so the model focuses on learning hard, misclassified samples.

$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases} \qquad (6)$$

Formula (6) is the expression of the probability discrimination function: p, between 0 and 1, is the predicted output of the activation function and represents the probability that the target belongs to the labeled class, and y is the ground-truth label.
Step 2, comparing and screening targets.
To prevent tracking failure caused by updating the reference frame with background information when no target is present in the image, or by a false alarm being detected as the target, SURF feature point matching is performed once before the result is sent to the detection correction unit: the first-frame target of the current video is feature-matched against the detection result. When the number of matched points exceeds a set value, the detection result is confirmed as the target currently to be tracked and detection succeeds; the detection box is then sent to the detection correction unit for the subsequent process.
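A minimal OpenCV sketch of this gating check is given below; the Hessian threshold, ratio-test constant and minimum match count are assumed values, and SURF requires an opencv-contrib build:

```python
import cv2

def is_same_target(first_frame_patch, detection_patch, min_matches=10):
    """Accept the detection only if enough SURF feature points match the
    first-frame target; min_matches is an assumed threshold."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    kp1, des1 = surf.detectAndCompute(first_frame_patch, None)
    kp2, des2 = surf.detectAndCompute(detection_patch, None)
    if des1 is None or des2 is None:
        return False
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des1, des2, k=2)
    # Lowe's ratio test keeps only distinctive matches.
    good = []
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:
            good.append(pair[0])
    return len(good) >= min_matches
```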
Step 3, target tracking online learning.
As shown in fig. 3, online learning for target tracking realizes real-time prediction of the target center position during tracking. It mainly comprises two parts: classifier initialization and the online classification process:
1) initialization classifier
For the data-enhanced reference frame in the sample library, features are extracted with the feature extraction network; at the same time, a two-dimensional Gaussian ground-truth label $y_{gt}$ of the same size as the feature map is generated with the center position of the reference-frame target as its peak, as in formula (1).
The classifier is initialized from the features and the label; a least-squares optimization algorithm minimizes the distance between the actual value and the ground truth, and the nonlinear least-squares problem is solved by the Gauss-Newton iteration method. The basic idea of Gauss-Newton iteration is to approximate the nonlinear regression model by its Taylor series expansion and then, through repeated iterations and corrections of the regression coefficients, drive them toward the optimal coefficients of the nonlinear model, finally minimizing the residual sum of squares of the original model.
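The ground-truth label of formula (1) is straightforward to generate; below is a minimal NumPy sketch, assuming feature-map coordinates start at 1 as in the text:

```python
import numpy as np

def gaussian_label(height, width, cx, cy, sigma):
    """Two-dimensional Gaussian ground-truth label y_gt of formula (1):
    peak 1.0 at the target centre (cx, cy) on an N x M feature map."""
    xs = np.arange(1, width + 1)
    ys = np.arange(1, height + 1)
    x, y = np.meshgrid(xs, ys)
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
```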
2) Online classification process
As shown in FIG. 3, based on the tracking result $(x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1})$ of the previous frame, where $(x_{t-1}, y_{t-1})$ is the estimated target center and $(w_{t-1}, h_{t-1})$ the estimated target width and height, the width and height are expanded by the specified ratio k around the previous frame's target center, generating the current frame's search area $(x_{t-1}, y_{t-1}, k \cdot w_{t-1}, k \cdot h_{t-1})$. The feature extraction network then extracts the search-area feature $f_t$, and after two fully connected layers a predicted Gaussian response map of the same size as the search area is generated, $\hat{y}_t = \varphi_2(\varphi_1(f_t; \mathrm{weight}_1); \mathrm{weight}_2)$; the maximum response position is the target center coordinate $(x_t, y_t)$ estimated for the current frame. The online-trained classifier fully considers both the tracked target and the background area, and is continuously updated to estimate the target position.
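The two bookkeeping steps above, search-region generation and peak extraction, reduce to a few lines; in the sketch below, the value k = 5 is an assumption mirroring the 5x crop used by the reference frame initialization unit:

```python
import numpy as np

def next_search_region(prev_box, k=5.0):
    """Search region of the current frame, centred on the previous estimate
    and scaled by the factor k (value assumed)."""
    x, y, w, h = prev_box
    return (x, y, k * w, k * h)

def peak_position(response_map):
    """Centre estimate (x_t, y_t): the maximum of the predicted Gaussian response."""
    iy, ix = np.unravel_index(np.argmax(response_map), response_map.shape)
    return ix, iy
```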
Step 4, dynamically constructing a classifier training sample library
In view of the complex motion state of low-slow small flying targets and the large changes in target size and form during tracking, using only the first frame of the video as the supervision frame clearly cannot adapt to the target's dynamic changes in real time. The following two measures further alleviate this problem, summarized in the policy sketch after them:
First, the reference frame update interval is set to T. When the current frame number t is divisible by T, the target detection unit is called to update the reference frame, all outdated samples in the sample library are eliminated, and the classifier is reinitialized with the updated reference frame. Meanwhile, newly generated samples are sequentially added to the sample library as tracking proceeds, so the features of targets in the sample library remain highly similar to the currently tracked sample, which aids accurate estimation of the target center position.
Second, when the predicted-position confidence of the tracking box is smaller than the set threshold, target tracking of the current frame has failed. The decision control unit sends the detection correction unit the information that the reference frame needs to be reinitialized; after the detection correction unit receives the detection-box information from the target detection unit, the information is sent to the reference frame initialization unit, where data enhancement and related operations are performed, and the result is finally sent to the dynamically constructed sample library.
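The two measures combine into a simple per-frame policy; the sketch below uses assumed values for the interval T and the confidence threshold:

```python
def tracking_step(t, confidence, T=50, conf_threshold=0.5):
    """Sketch of the update policy in step 4; T and conf_threshold are assumed values.

    Returns the action the decision control unit would take for frame t."""
    if confidence < conf_threshold:
        # Tracking failed: ask the detection correction unit for a new reference frame.
        return "reinitialize_from_detector"
    if t % T == 0:
        # Periodic refresh keeps the sample library close to the current appearance.
        return "update_reference_frame"
    return "continue_tracking"
```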
Step 5, fine trimming of target tracking position
As shown in fig. 4, the main function of target tracking position refinement is to finally determine the position and scale of the current frame's tracking box from the target center predicted by the classifier and the target width and height of the previous frame. It mainly comprises a feature extraction network and a similarity evaluation network:
1) feature extraction network
Feature extraction still uses the ResNet-18 network. To make full use of historical information, previous template information is kept in balance with current reference-frame update information, providing the neural network with features that combine the target's current and historical states and improving tracking stability. Search-area features are extracted from three image frames, the reference frame, the current frame, and the frame at the intermediate time between them, and each is fed into the precise region-of-interest pooling layer for the similarity evaluation network to compute the predicted-position confidence.
2) Similarity evaluation network
The core of the similarity evaluation network is the precise region-of-interest pooling layer, whose input comprises two parts. The first part is the image feature map extracted by the network, where $(i, j)$ is a coordinate index on the feature map and $w_{i,j}$ the weight at position $(i, j)$; bilinear interpolation with interpolation coefficient IC is applied:

$$IC(x, y, i, j) = \max(0, 1 - |x - i|) \times \max(0, 1 - |y - j|) \qquad (8)$$

mapping the discrete feature map into a continuous space, $f(x, y) = \sum_{i,j} IC(x, y, i, j) \times w_{i,j}$. The second part is the top-left and bottom-right corner coordinates $(x_1, y_1)$ and $(x_2, y_2)$ of the rectangular box. The precise region-of-interest pooling operation is performed on the obtained continuous spatial feature map and the rectangular-box coordinates: the feature map $f(x, y)$ is doubly integrated over the box and divided by the box area, preserving the target features on the image to the greatest extent in preparation for further comparing the similarity of the reference target and historical-frame targets. After the features from the precise region-of-interest pooling layer are obtained, the three features of the reference frame, the intermediate frame and the current frame are concatenated and input into the fully connected layer, which outputs the final position confidence. By comparing the similarity of candidate targets with the historical target, the most similar target is found and taken as the tracking result.
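For illustration, the pooling of formulas (2)-(4) can be approximated numerically; the sketch below densely samples the bilinearly interpolated map and averages, whereas the actual layer evaluates the double integral analytically, so this is only an approximation:

```python
import numpy as np

def prroi_pool(feature, x1, y1, x2, y2, samples=32):
    """Numerical sketch of precise RoI pooling on a single-channel map:
    average the interpolated feature f(x, y) over the box, i.e. integral / area."""
    def f(x, y):
        # Bilinear interpolation with the coefficient IC of formula (2).
        i0, j0 = int(np.floor(x)), int(np.floor(y))
        val = 0.0
        for i in (i0, i0 + 1):
            for j in (j0, j0 + 1):
                if 0 <= i < feature.shape[1] and 0 <= j < feature.shape[0]:
                    ic = max(0, 1 - abs(x - i)) * max(0, 1 - abs(y - j))
                    val += ic * feature[j, i]
        return val

    xs = np.linspace(x1, x2, samples)
    ys = np.linspace(y1, y2, samples)
    # The mean of dense samples approaches integral / area in the limit.
    return np.mean([f(x, y) for y in ys for x in xs])
```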
It will be appreciated by those of ordinary skill in the art that the examples described herein are intended to assist the reader in understanding the manner in which the invention is practiced, and it is to be understood that the scope of the invention is not limited to such specifically recited statements and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from the spirit of the invention, and these changes and combinations are within the scope of the invention.
Claims (2)
1. A small low-flying target detection and tracking system, comprising: a video data input unit, a video preprocessing unit, a training data construction unit, a detection model training unit, a target comparison screening unit, a detection correction unit, a reference frame initialization unit, a sample library dynamic construction unit, an online learning unit, a position refinement unit, a decision control unit and a tracking result output unit;
the video data input unit is configured to: inputting a plurality of video sequences containing targets, and randomly dividing the video sequences into two parts, wherein one part is used for training a target detection model, and the other part is used for online testing of a target tracking model;
the video preprocessing unit is configured to: complete the preliminary video preprocessing work according to the needs of the target detection and tracking units, specifically including deleting video segments that contain no target for a long time and eliminating video segments that clearly do not match the characteristics of small targets flying at low speed in low-altitude airspace;
the training data construction unit is configured to: ensuring the completeness and richness of training data, constructing a training set and a verification set by extracting video frames at equal intervals, and carrying out data labeling, namely determining the central position, width and height information of a target in an image, wherein the information is used for a supervised training target detection model;
the detection model training unit is used for: creating a target detection network with a pyramid structure, and relieving the target-background imbalance problem using focal loss; stopping the training process after observing that the training loss function tends to be stable, and storing the model file with optimal performance in the verification process to provide reset-box information when target tracking fails;
the target alignment screening unit is used for: comparing the first frame true value frame of the tracked video with the detection result by using a SURF (speeded up robust features) feature matching algorithm, eliminating false alarms which are obviously not low-slow small targets, and further ensuring the robustness of long-term stable tracking;
the detection correction unit is used for: starting when either of the following two conditions occurs: first, when the position confidence of the tracking box is lower than a set threshold, indicating that target tracking of the current frame has failed; second, automatically when a specified frame interval is reached, ensuring that the currently tracked target does not differ too much from the target features of the reference frame;
the reference frame initialization unit is configured to: cutting out a search area according to the received target position and scale information of the reference frame and according to the size 5 times of the target size, zooming the image to the specified size 288 x 288, and inputting the image blocks of the search area with the specified size, the scale information of the target position and the like into a dynamic construction sample library of the classifier;
the sample library dynamic construction unit is used for: performing data enhancement on the reference frame sample, including basic operations of rotation, scaling, dithering and blurring, and receiving a newly added sample in the tracking process;
the online learning unit extracts features from samples stored in the sample library using a ResNet18 deep network and obtains a predicted Gaussian response map after two fully connected layers; according to the label information in the sample library, a ground-truth Gaussian label is generated with the target center as the peak point of the Gaussian distribution; the parameters of the two fully connected layers are adjusted online with the optimization target of reducing the difference between the predicted Gaussian response map and the ground-truth Gaussian label, so that, without labels, the feature extraction network and the fully connected layers yield the predicted Gaussian response map and hence the predicted target position center;
the position refinement unit is configured to: refine the initial tracking result obtained by the online learning unit; several jittered boxes are obtained with reference to the predicted target center obtained by the online learning unit for the current frame and the target width and height obtained for the previous frame, and mapped onto the current frame's search area; features are extracted from the jittered boxes with the precise region-of-interest pooling layer and concatenated with the modulation features obtained from the reference frame; finally, the predicted-position confidence of each jittered box is obtained through the fully connected layer, and the results of the 3 jittered boxes with the highest confidence are merged to obtain the refined tracking result of the current frame;
the decision control unit is configured to: judging a target tracking state through the relation between the confidence coefficient of the predicted position of the tracking frame and a set threshold value in the tracking process, if the target is stably tracked, continuing to track the next frame, and if the target is lost, activating a detection and correction unit to detect the target of the current frame and updating a reference frame so as to realize long-term stable tracking of low and slow small targets;
the tracking result output unit is used for: and after traversing all the frames of the video, outputting the position and scale information of each frame.
2. The detection and tracking method of the small-sized low-flying target detection and tracking system according to claim 1, characterized by comprising the following steps:
step 1, constructing a target detection network;
1) constructing a network structure comprising: a backbone network, a classification subnet and a regression subnet;
2) a loss function: focal loss addresses the severe imbalance between the proportions of positive and negative samples in target detection and reduces the weight of simple negative samples in training;
step 2, comparing and screening targets;
before the video is sent to the detection correction unit, SURF feature point matching is performed once: the first-frame target of the current video is feature-matched against the detection result; when the number of matched points exceeds a set value, the detection result is confirmed as the target currently to be tracked and detection succeeds; the detection box is then sent to the detection correction unit for the subsequent process;
step 3, target tracking online learning;
predicting the target center position, comprising two parts of an initialization classifier and an online classification process:
1) initialization classifier
for the data-enhanced reference frame in the sample library, features are extracted with the feature extraction network, and a two-dimensional Gaussian ground-truth label $y_{gt}$ of the same size as the feature map is generated with the target center position of the reference frame as its peak; the classifier is initialized according to the features and the label, a least-squares optimization algorithm minimizes the distance between the actual value and the ground truth, and the nonlinear least-squares problem is solved by the Gauss-Newton iteration method; the formula is as follows:

$$y_{gt}(x, y) = \exp\left(-\frac{(x - x_0)^2 + (y - y_0)^2}{2\sigma^2}\right) \qquad (1)$$

in formula (1), $x \in \{1, \dots, M\}$ is the horizontal coordinate on the feature map and M the width of the feature map, $y \in \{1, \dots, N\}$ is the vertical coordinate and N the height of the feature map, $(x_0, y_0)$ is the center point of the reference-frame target, and $\sigma$ is the Gaussian bandwidth;
2) online classification process
based on the tracking result $(x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1})$ of the previous frame (frame t-1), wherein $(x_{t-1}, y_{t-1})$ is the estimated target center and $(w_{t-1}, h_{t-1})$ the estimated target width and height, the target center position of the previous frame is taken as the center of the current frame's (frame t) target search area, the width and height are expanded by a specified ratio k, and the current frame's search area $(x_{t-1}, y_{t-1}, k \cdot w_{t-1}, k \cdot h_{t-1})$ is generated; the feature extraction network then extracts the search-area feature $f_t$, and after two fully connected layers a predicted Gaussian response map consistent with the size of the search area is generated, $\hat{y}_t = \varphi_2(\varphi_1(f_t; \mathrm{weight}_1); \mathrm{weight}_2)$, where $\varphi_1, \varphi_2$ denote the mapping functions of the fully connected layers and $\mathrm{weight}_1, \mathrm{weight}_2$ their weight coefficient matrices; the maximum response position is the target center coordinate $(x_t, y_t)$ estimated for the current frame; the online-trained classifier fully considers the tracked target and the background area and is continuously updated to estimate the target position;
step 4, dynamically constructing a classifier training sample library
1) the reference frame update interval is set to T; when the current frame number t is divisible by T, the target detection unit is called to update the reference frame, all outdated samples in the sample library are removed, and the classifier is reinitialized with the updated reference frame; meanwhile, newly generated samples are sequentially added to the sample library as tracking proceeds, so that the features of targets in the sample library remain highly similar to the currently tracked sample and the target center position can be accurately estimated;
2) when the predicted-position confidence of the tracking box is smaller than the set threshold, target tracking of the current frame has failed; the decision control unit sends the detection correction unit the information that the reference frame needs to be reinitialized; after the detection correction unit receives the detection-box information from the target detection unit, the information is sent to the reference frame initialization unit, data enhancement and related operations are performed, and the result is finally sent to the dynamically constructed sample library;
step 5, fine trimming of the target tracking position;
the method comprises two parts of a feature extraction network and a similarity evaluation network:
1) feature extraction network
feature extraction uses a ResNet18 network; retaining previous template information is balanced against updating with current reference-frame information, providing the neural network with features that combine the target's current and historical states and improving tracking stability; search-area features are extracted for the three parts, the reference frame, the current frame, and the image frame at the intermediate time between them, and are respectively fed into the precise region-of-interest pooling layer for the similarity evaluation network to compute the predicted-position confidence;
2) similarity evaluation network
the core of the similarity evaluation network is a precise region-of-interest pooling layer whose input comprises two parts: the first part is the image feature map extracted by the network, to which bilinear interpolation with interpolation coefficient IC is applied,

$$IC(x, y, i, j) = \max(0, 1 - |x - i|) \times \max(0, 1 - |y - j|) \qquad (2)$$

mapping the discrete feature map into a continuous space and obtaining the feature map $f(x, y)$,

$$f(x, y) = \sum_{i, j} IC(x, y, i, j) \times w_{i, j} \qquad (3)$$

in formulas (2) and (3), $(x, y)$ is a continuous coordinate on the feature map, $(i, j)$ is a coordinate index on the feature map, and $w_{i,j}$ is the weight at position $(i, j)$ on the feature map; the second part of the input is the top-left corner coordinates $(x_1, y_1)$ and bottom-right corner coordinates $(x_2, y_2)$ of the rectangular box; the precise region-of-interest pooling operation is performed on the obtained continuous spatial feature map and the rectangular-box coordinates, preserving the target features on the image to the greatest extent in preparation for further comparing the similarity of the reference target and historical-frame targets; finally, the feature map $f(x, y)$ is doubly integrated and divided by the area of the rectangular box, giving precise region-of-interest pooling (PrROI Pooling),

$$\mathrm{PrPool}(f, \mathit{RoI}) = \frac{\int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y)\, dx\, dy}{(x_2 - x_1) \times (y_2 - y_1)} \qquad (4)$$
after the features of the precise region-of-interest pooling layer are obtained, the three features of the reference frame, the intermediate frame and the current frame are concatenated and input into the fully connected layer, which outputs the final position confidence; by comparing the similarity of candidate targets with the historical target, the most similar target is found and taken as the tracking result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010309617.7A CN111508002B (en) | 2020-04-20 | 2020-04-20 | Small-sized low-flying target visual detection tracking system and method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010309617.7A CN111508002B (en) | 2020-04-20 | 2020-04-20 | Small-sized low-flying target visual detection tracking system and method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111508002A (en) | 2020-08-07
CN111508002B CN111508002B (en) | 2020-12-25 |
Family
ID=71869437
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010309617.7A Active CN111508002B (en) | 2020-04-20 | 2020-04-20 | Small-sized low-flying target visual detection tracking system and method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111508002B (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7379652B2 (en) * | 2005-01-14 | 2008-05-27 | Montana State University | Method and apparatus for detecting optical spectral properties using optical probe beams with multiple sidebands |
JP2014059710A (en) * | 2012-09-18 | 2014-04-03 | Toshiba Corp | Object detection device and object detection method |
CN107862705A (en) * | 2017-11-21 | 2018-03-30 | 重庆邮电大学 | A kind of unmanned plane small target detecting method based on motion feature and deep learning feature |
CN108230367A (en) * | 2017-12-21 | 2018-06-29 | 西安电子科技大学 | A kind of quick method for tracking and positioning to set objective in greyscale video |
CN108154118A (en) * | 2017-12-25 | 2018-06-12 | 北京航空航天大学 | A kind of target detection system and method based on adaptive combined filter with multistage detection |
CN109584269A (en) * | 2018-10-17 | 2019-04-05 | 龙马智芯(珠海横琴)科技有限公司 | A kind of method for tracking target |
CN110363789A (en) * | 2019-06-25 | 2019-10-22 | 电子科技大学 | A kind of long-term visual tracking method towards practical engineering application |
CN110533691A (en) * | 2019-08-15 | 2019-12-03 | 合肥工业大学 | Method for tracking target, equipment and storage medium based on multi-categorizer |
CN110717934A (en) * | 2019-10-17 | 2020-01-21 | 湖南大学 | Anti-occlusion target tracking method based on STRCF |
Non-Patent Citations (6)
Title |
---|
BORUI JIANG et al.: "Acquisition of Localization Confidence for Accurate Object Detection", https://arxiv.org/abs/1807.11590 *
FAWEI YANG et al.: "DDMA MIMO radar system for low, slow, and small target detection", The Journal of Engineering *
NUSSLER, D. et al.: "Detection of unmanned aerial vehicles (UAV) in urban environments", Conference on Emerging Imaging and Sensing Technologies for Security and Defence III; and Unmanned Sensors, Systems, and Countermeasures *
TSUNG-YI LIN: "Focal Loss for Dense Object Detection", https://arxiv.org/abs/1708.02002 *
WU Yanfeng (吴言枫) et al.: "'Low, slow and small' target detection technology under complex dynamic backgrounds", 《中国光学》 (Chinese Optics) *
LI Aishi (李爱师): "Research on target tracking algorithms based on correlation filtering", 《中国优秀硕士学位论文全文数据库 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology) *
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112489081A (en) * | 2020-11-30 | 2021-03-12 | 北京航空航天大学 | Visual target tracking method and device |
CN112633162A (en) * | 2020-12-22 | 2021-04-09 | 重庆大学 | Rapid pedestrian detection and tracking method suitable for expressway outfield shielding condition |
CN112633162B (en) * | 2020-12-22 | 2024-03-22 | 重庆大学 | Pedestrian rapid detection and tracking method suitable for expressway external field shielding condition |
WO2022178833A1 (en) * | 2021-02-26 | 2022-09-01 | 京东方科技集团股份有限公司 | Target detection network training method, target detection method, and apparatus |
US12002254B2 (en) | 2021-02-26 | 2024-06-04 | Boe Technology Group Co., Ltd. | Method and apparatus of training object detection network and object detection method and apparatus |
CN112949480A (en) * | 2021-03-01 | 2021-06-11 | 浙江大学 | Rail elastic strip detection method based on YOLOV3 algorithm |
CN113012203B (en) * | 2021-04-15 | 2023-10-20 | 南京莱斯电子设备有限公司 | High-precision multi-target tracking method under complex background |
CN113012203A (en) * | 2021-04-15 | 2021-06-22 | 南京莱斯电子设备有限公司 | High-precision multi-target tracking method under complex background |
CN113449680B (en) * | 2021-07-15 | 2022-08-30 | 北京理工大学 | Knowledge distillation-based multimode small target detection method |
CN113449680A (en) * | 2021-07-15 | 2021-09-28 | 北京理工大学 | Knowledge distillation-based multimode small target detection method |
CN113724290A (en) * | 2021-07-22 | 2021-11-30 | 西北工业大学 | Multi-level template self-adaptive matching target tracking method for infrared image |
CN113724290B (en) * | 2021-07-22 | 2024-03-05 | 西北工业大学 | Multi-level template self-adaptive matching target tracking method for infrared image |
CN113658222A (en) * | 2021-08-02 | 2021-11-16 | 上海影谱科技有限公司 | Vehicle detection tracking method and device |
CN114066936B (en) * | 2021-11-06 | 2023-09-12 | 中国电子科技集团公司第五十四研究所 | Target reliability tracking method in small target capturing process |
CN114066936A (en) * | 2021-11-06 | 2022-02-18 | 中国电子科技集团公司第五十四研究所 | Target reliability tracking method in small target capturing process |
CN114241008B (en) * | 2021-12-21 | 2023-03-07 | 北京航空航天大学 | Long-time region tracking method adaptive to scene and target change |
CN114241008A (en) * | 2021-12-21 | 2022-03-25 | 北京航空航天大学 | Long-time region tracking method adaptive to scene and target change |
CN116596958B (en) * | 2023-07-18 | 2023-10-10 | 四川迪晟新达类脑智能技术有限公司 | Target tracking method and device based on online sample augmentation |
CN116596958A (en) * | 2023-07-18 | 2023-08-15 | 四川迪晟新达类脑智能技术有限公司 | Target tracking method and device based on online sample augmentation |
CN117292306A (en) * | 2023-11-27 | 2023-12-26 | 四川迪晟新达类脑智能技术有限公司 | Edge equipment-oriented vehicle target detection optimization method and device |
CN117576164A (en) * | 2023-12-14 | 2024-02-20 | 中国人民解放军海军航空大学 | Remote sensing video sea-land movement target tracking method based on feature joint learning |
CN117576164B (en) * | 2023-12-14 | 2024-05-03 | 中国人民解放军海军航空大学 | Remote sensing video sea-land movement target tracking method based on feature joint learning |
Also Published As
Publication number | Publication date |
---|---|
CN111508002B (en) | 2020-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111508002B (en) | Small-sized low-flying target visual detection tracking system and method thereof | |
CN109614985B (en) | Target detection method based on densely connected feature pyramid network | |
CN107609525B (en) | Remote sensing image target detection method for constructing convolutional neural network based on pruning strategy | |
US9846946B2 (en) | Objection recognition in a 3D scene | |
CN110287826B (en) | Video target detection method based on attention mechanism | |
CN107633226B (en) | Human body motion tracking feature processing method | |
CN108446634B (en) | Aircraft continuous tracking method based on combination of video analysis and positioning information | |
CN110163213B (en) | Remote sensing image segmentation method based on disparity map and multi-scale depth network model | |
CN113483747B (en) | Improved AMCL (advanced metering library) positioning method based on semantic map with corner information and robot | |
CN111882586B (en) | Multi-actor target tracking method oriented to theater environment | |
CN111476817A (en) | Multi-target pedestrian detection tracking method based on yolov3 | |
CN111738055B (en) | Multi-category text detection system and bill form detection method based on same | |
CN108564598B (en) | Improved online Boosting target tracking method | |
CN111160407A (en) | Deep learning target detection method and system | |
CN111126278A (en) | Target detection model optimization and acceleration method for few-category scene | |
CN111274964B (en) | Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle | |
CN111354022A (en) | Target tracking method and system based on kernel correlation filtering | |
CN112164093A (en) | Automatic person tracking method based on edge features and related filtering | |
CN113129332A (en) | Method and apparatus for performing target object tracking | |
CN110751671B (en) | Target tracking method based on kernel correlation filtering and motion estimation | |
CN102663419A (en) | Pan-tilt tracking method based on representation model and classification model | |
CN112200831A (en) | Dense connection twin neural network target tracking method based on dynamic template | |
CN111275733A (en) | Method for realizing rapid tracking processing of multiple ships based on deep learning target detection technology | |
CN114972711B (en) | Improved weak supervision target detection method based on semantic information candidate frame | |
CN115953431A (en) | Multi-target tracking method and system for aerial video of unmanned aerial vehicle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||