CN111062973B - Vehicle tracking method based on target feature sensitivity and deep learning - Google Patents
- Publication number
- CN111062973B (application CN201911408023.5A)
- Authority
- CN
- China
- Prior art keywords
- picture
- target
- tracking
- filtering
- discriminant
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments (G — Physics; G06 — Computing; G06T — Image data processing or generation, in general; G06T7/00 — Image analysis; G06T7/20 — Analysis of motion)
- G06T2207/10016 — Video; image sequence (G06T2207/10 — Image acquisition modality)
- G06T2207/20024 — Filtering details (G06T2207/20 — Special algorithmic details)
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
Abstract
The invention discloses a vehicle tracking method based on target feature sensitivity and deep learning. It mainly addresses the prior-art problem that, during vehicle tracking, occlusion, illumination changes and the like cause interferers similar to the vehicle target to be mistaken for the target, so that tracking fails. The method comprises the following steps: construct and train a discriminant connected network (a Siamese-style twin network), extract features through a trained public network model, select the filters that are most sensitive to the vehicle target, and track the vehicle target using the discriminant connected network together with the selected sensitive filters. By introducing the selection and use of a sensitive filter bank, the method is robust, tracks well, has low computational cost, and is easy to implement.
Description
Technical Field
The invention belongs to the technical field of image processing, and further relates to a vehicle tracking method based on target feature sensitivity and deep learning within the field of target tracking. The invention can be used to track vehicles in autonomous driving, driver assistance, and intelligent transportation.
Background
The task of vehicle tracking is to predict the size and position of a vehicle in subsequent frames, given its size and position in an initial frame of a video sequence. Tracking based on correlation filtering has attracted much attention for its real-time performance: the filtering template is updated with training data from the previous frame's tracking result, a response map is obtained by correlating this template with the features extracted from the current frame, and the position of the maximum response point on the response map is the position of the vehicle target. To cope with appearance changes of the target during tracking, various feature descriptors have been designed, such as HOG and SIFT features. With the rapid development of deep learning in object detection, image classification and image segmentation, applying deep neural networks as feature extractors in vehicle tracking has become a recent trend.
The patent document "A road vehicle tracking method based on multi-feature fusion" (application number 201910793516.9, publication number CN 110517291 A), filed by Nanjing University of Posts and Telecommunications, discloses a road vehicle tracking method based on multi-feature spatial fusion. First, a video is read and split into image frames, the region containing the vehicle target is selected, the input frame is converted from the RGB color space to the HSV color space, and a color histogram is taken as the color feature; horizontal, vertical and diagonal edge features are computed via an integral image to obtain Haar-like shape features. Then, target and candidate models are established in the vertical-edge, horizontal-edge, diagonal-edge and color feature spaces respectively; the similarity between the two models is measured with the Bhattacharyya coefficient, and the position of the candidate model most similar to the target model in the current frame is iteratively computed with the mean-shift algorithm. Finally, four possible target positions are found in the color and three edge feature spaces respectively and fused by weighting to obtain the final position of the target. The disadvantage of this method is that, because it describes the vehicle's appearance with Haar-like shape features, under illumination changes, mutual occlusion between vehicles, or vehicle motion blur, the Haar-like features easily mistake an interferer similar to the vehicle target for the target, and tracking fails.
In real-time vehicle tracking under actual road conditions, mutual occlusion between vehicles is very common, so the robustness of this method cannot meet the requirements of vehicle tracking on real roads.
Disclosure of Invention
The invention aims to provide a vehicle tracking method based on target feature sensitivity and deep learning that addresses the above shortcomings of the prior art, solving the tracking failures caused by occlusion, illumination changes and the like during vehicle tracking.
The idea for realizing this aim is as follows: construct and train a discriminant connected network, extract features through a trained public network model, select the filters that are most sensitive to the vehicle target, and track the vehicle target using the discriminant connected network together with the selected sensitive filters.
The method comprises the following specific steps:
Step 1, constructing the discriminant connected network:
build two identical sub-networks, each with a five-layer structure, in order: first convolution layer → first downsampling layer → second convolution layer → second downsampling layer → third convolution layer; set the numbers of convolution kernels of the first, second and third convolution layers to 16, 32 and 1, and their kernel sizes to 3 × 3, 3 × 3 and 1 × 1, respectively; set the filter sizes of both downsampling layers to 2 × 2;
arrange the two sub-networks in parallel, one above the other, and connect them to a cross-correlation layer XCorr to form the discriminant connected network; set the loss function of the discriminant connected network to the contrastive loss function;
step 2, generating a training set:
randomly collect at least 1000 pictures from a continuous video, each containing at least one labeled target; cut each labeled target out of its picture as a 127 × 127 picture, and randomly cut 127 × 127 pictures from the backgrounds;
randomly combine the cut target pictures and background pictures into picture pairs, each pair containing at least one target picture; if the two pictures in a pair show the same target, set the pair's label to 1; if they show two different targets, or a target and a background, set the label to 0; all picture pairs and their labels form the training set;
step 3, training the discriminant connected network:
inputting the training set into the discriminant connected network, and iteratively updating the network weights with the Adam optimization algorithm until the contrastive loss function converges, obtaining the trained discriminant connected network;
step 4, calculating a filtering template:
firstly, draw a rectangular box in the first frame of the tracking video tightly around the tracked vehicle target; all pixel points within the box form the real target picture, and all pixel points within a box sharing the same centre point but with the width and the height each doubled form the initial filtering sample picture;
secondly, generating initial filter labels corresponding to each pixel point in the initial filter sample picture one by using a filter label generation formula, and forming the initial filter labels of all the pixel points into a label picture;
inputting the initial filtering sample picture into a trained public network model, outputting two-dimensional sub-feature matrixes with the same number as the last layer of filter of the model, and summing elements at the same positions in all the two-dimensional sub-feature matrixes to obtain a two-dimensional deep layer feature matrix of the initial filtering sample picture;
fourthly, generating a filtering template by using a filtering template calculation formula and the two-dimensional deep layer feature matrix of the tag picture and the initial filtering sample picture;
step 5, determining the sensitive filter combination:
firstly, perform a correlation filtering operation between each two-dimensional sub-feature matrix of the initial filtering sample picture and the filtering template, obtaining as many response maps as there are filters;
secondly, compare the response point values within each response map to determine that map's maximum response point;
thirdly, compute the distance between each response map's maximum response point and the centre point of the label picture, and take the filters corresponding to the 100 smallest distance values to form the sensitive filter combination;
step 6, set the first frame of the tracking video as the current frame;
step 7, locate the tracked vehicle target in the next frame image of the current frame;
step 8, generate the target picture to be evaluated:
in the next frame of the current frame, centred on the located position, extract all pixel points in a region of the same size as the real target picture generated in the first step of step 4 to form the target picture to be evaluated;
step 9, input the real target picture and the target picture to be evaluated into the discriminant connected network trained in step 3 and check whether its output is 1; if so, set the next frame of the current frame as the current frame and execute step 11; otherwise, consider the tracking failed and execute step 10;
step 10, relocate the tracked target:
input the next frame of the current frame into a common detector to output the position of the vehicle target to be tracked, take this output position as the position of the tracked vehicle target in that frame, set the next frame of the current frame as the current frame, and execute step 11;
step 11, judge whether the current frame is the last frame of the tracking video; if so, execute step 12, otherwise execute step 7;
step 12, finish the vehicle tracking process.
Compared with the prior art, the invention has the following advantages:
firstly, the invention selects the filter which is more sensitive to the vehicle target, and can accurately extract the characteristics of the tracked vehicle target when similar interference occurs, thereby overcoming the problem that the interference similar to the vehicle target is easily judged as the vehicle target when illumination change, mutual vehicle shielding and vehicle motion blurring occur in the prior art, and having the advantages of low calculated amount and strong robustness.
Secondly, the method can evaluate the tracking result by constructing and training the discriminant connection network, can relocate the vehicle target after the tracking fails, and overcomes the problem that the tracking is difficult to continue after the tracking fails in the prior art, so that the method has the advantage of high tracking accuracy.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a filter label diagram of the present invention;
fig. 3 is a structure diagram of a discriminant connectivity network constructed by the present invention.
Detailed Description
The technical solution and effects of the present invention will be further described in detail with reference to the accompanying drawings.
The specific implementation steps of the present invention are further described in detail with reference to fig. 1.
Two identical sub-networks are built; each sub-network has five layers, from left to right: first convolution layer → first downsampling layer → second convolution layer → second downsampling layer → third convolution layer. The numbers of convolution kernels of the first, second and third convolution layers are set to 16, 32 and 1, with kernel sizes 3 × 3, 3 × 3 and 1 × 1, respectively; the filter sizes of the first and second downsampling layers are set to 2 × 2.
Two sub-networks are arranged in parallel up and down and then connected with a cross-correlation layer XCorr to form a discriminant type connected network; and setting the loss function of the discriminant connected network as a contrast loss function.
The discriminant connected network constructed by the present invention is further described with reference to fig. 3.
The upper and lower rows in fig. 3 represent the two sub-networks; each sub-network consists, from left to right, of the first convolutional layer, first downsampling layer, second convolutional layer, second downsampling layer and third convolutional layer, and the two parallel sub-networks are connected to the cross-correlation layer XCorr.
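As a rough illustration, the spatial sizes produced by one sub-network can be traced with a short sketch. The patent does not state padding or stride, so this assumes 'valid' convolutions with stride 1 and non-overlapping 2 × 2 downsampling:

```python
def conv_out(size, kernel):
    """Output spatial size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, window=2):
    """Output spatial size of non-overlapping 2x2 downsampling."""
    return size // window

def subnetwork_shapes(input_size=127):
    """Trace the spatial size through conv3 -> pool2 -> conv3 -> pool2 -> conv1."""
    s = conv_out(input_size, 3)   # first conv layer, 16 kernels of 3x3
    s = pool_out(s)               # first downsampling layer
    s = conv_out(s, 3)            # second conv layer, 32 kernels of 3x3
    s = pool_out(s)               # second downsampling layer
    s = conv_out(s, 1)            # third conv layer, 1 kernel of 1x1
    return s
```

With the 127 × 127 training crops of step 2, each branch would emit a 30 × 30 single-channel map under these assumptions, which the XCorr layer then cross-correlates.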
Step 2, generating the training set.
Randomly collecting at least 1000 pictures from a continuous video, wherein each picture comprises at least one target and marks the target; the labeled objects in the picture are cut out into 127 × 127 pictures, and the background in the pictures is randomly cut out into 127 × 127 pictures.
Combining the cut target picture and the cut background picture into a picture pair at random, wherein each picture pair at least comprises one target picture; if two pictures in the picture pair are the same target, setting the label of the picture pair to be 1; if the two pictures in the image pair are two different target pictures or a target picture and a background picture, setting the label of the picture pair to be 0; and (4) forming a training set by all the picture pairs and the labels thereof.
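The pair-generation rule of step 2 can be sketched as follows. This is a minimal illustration with hypothetical crop identifiers standing in for the 127 × 127 image crops; the anchor of every pair is a target crop, so each pair contains at least one target picture:

```python
import random

def make_pairs(targets, backgrounds, n_pairs=4, seed=0):
    """targets: {track_id: [crop, ...]}; backgrounds: [crop, ...].
    Returns ((crop_a, crop_b), label) tuples: label 1 iff both crops
    show the same target, 0 for different targets or target/background."""
    rng = random.Random(seed)
    pairs = []
    ids = list(targets)
    for _ in range(n_pairs):
        tid = rng.choice(ids)
        anchor = rng.choice(targets[tid])          # every pair holds a target crop
        if rng.random() < 0.5:                     # positive pair: same target
            pairs.append(((anchor, rng.choice(targets[tid])), 1))
        elif len(ids) > 1 and rng.random() < 0.5:  # negative: two different targets
            other = rng.choice([i for i in ids if i != tid])
            pairs.append(((anchor, rng.choice(targets[other])), 0))
        else:                                      # negative: target vs background
            pairs.append(((anchor, rng.choice(backgrounds)), 0))
    return pairs
```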
Step 3, training the discriminant connected network.
The training set is input into the discriminant connected network, and the network weights are iteratively updated with the Adam optimization algorithm until the contrastive loss function converges, yielding the trained discriminant connected network.
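For reference, the contrastive loss named above has the following standard per-pair form. The margin value is an assumption for illustration; the patent does not state one:

```python
def contrastive_loss(distance, label, margin=1.0):
    """Contrastive loss for one training pair.
    `distance` is the distance between the two sub-network embeddings,
    `label` is 1 when both crops show the same target, else 0.
    The margin value is an assumption, not stated in the patent."""
    if label == 1:
        return distance ** 2                       # pull same-target pairs together
    return max(margin - distance, 0.0) ** 2        # push other pairs apart, up to the margin
```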
Step 4, calculating the filtering template.
Secondly, the following filter label generation formula is used to generate, one by one, the initial filter label corresponding to each pixel point in the initial filtering sample picture, and the initial filter labels of all pixel points form the label picture:

g(x, y) = (1 / (2πσ²)) · e^( −((x − x_c)² + (y − y_c)²) / (2σ²) )

wherein g(x, y) represents the initial filter label corresponding to the pixel point at (x, y) in the filtering sample, π represents the circular constant, σ represents a control parameter with value 0.5, e represents exponentiation with the natural constant as base, x_c represents the abscissa of the central pixel point of the initial filtering sample picture, and y_c represents its ordinate.
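The label picture described above can be generated with a small numpy sketch: a 2-D Gaussian centred on the target, with σ = 0.5 as stated in the text (the 1/(2πσ²) normaliser is implied by the mention of π):

```python
import numpy as np

def gaussian_label_map(height, width, sigma=0.5):
    """Label picture g(x, y): a 2-D Gaussian centred on the target position."""
    ys, xs = np.mgrid[0:height, 0:width]
    yc, xc = (height - 1) / 2.0, (width - 1) / 2.0   # centre pixel (y_c, x_c)
    sq = (xs - xc) ** 2 + (ys - yc) ** 2             # squared distance to centre
    return np.exp(-sq / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
```

The map peaks at the centre pixel and decays quickly with the small σ, so only pixels near the target position receive significant label values.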
The label picture generated by the present invention is further described with reference to fig. 2.
The size of fig. 2 is the same as that of the initial filtering sample picture, and the white dot at the centre of fig. 2 marks the location of the tracked vehicle target in the initial filtering sample picture.
Thirdly, the initial filtering sample picture is input into the trained public network model, which outputs two-dimensional sub-feature matrices equal in number to the filters of its last layer; elements at the same positions in all the sub-feature matrices are summed to obtain the two-dimensional deep feature matrix of the initial filtering sample picture.
Fourthly, the filtering template is generated from the label picture and the two-dimensional deep feature matrix of the initial filtering sample picture using the following filtering template calculation formula:

h = F⁻¹( ( F(g) ⊙ F*(f) ) / ( F(f) ⊙ F*(f) ) )

wherein F(·) represents the Fourier transform operation and F⁻¹(·) its inverse, h represents the filtering template, * represents the conjugate transpose operation, ⊙ represents element-wise multiplication, g represents the label picture, and f represents the two-dimensional deep feature matrix of the initial filtering sample picture.
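A minimal numpy sketch of the closed-form template and of the correlation filtering operation used with it. The small `eps` stabiliser in the denominator is an assumption added to avoid division by near-zero Fourier coefficients; it is not part of the patent's formula:

```python
import numpy as np

def filter_template(f, g, eps=1e-4):
    """Closed-form correlation-filter template:
    h = F^-1( F(g) .* conj(F(f)) / (F(f) .* conj(F(f))) ).
    f: deep feature matrix, g: label picture.  `eps` is an assumed stabiliser."""
    F = np.fft.fft2(f)
    G = np.fft.fft2(g)
    H = (G * np.conj(F)) / (F * np.conj(F) + eps)
    return np.real(np.fft.ifft2(H))

def correlate(f, h):
    """Circular correlation of feature matrix f with template h in the
    Fourier domain (the 'correlation filtering operation' in the text)."""
    return np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(h)))
```

By construction, correlating the training feature with its own template approximately reproduces the label picture, so the response peak falls on the target position.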
Step 5, determining the sensitive filter combination.
First, a correlation filtering operation is performed between each two-dimensional sub-feature matrix of the initial filtering sample picture and the filtering template, yielding as many response maps as there are filters.
Second, the response point values within each response map are compared to determine that map's maximum response point.
Third, the distance between each response map's maximum response point and the centre point of the label picture is computed, and the filters corresponding to the 100 smallest distance values form the sensitive filter combination.
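The sensitive-filter selection of step 5 can be sketched as follows; the patent keeps the k = 100 closest filters, while the example below uses a small k for illustration:

```python
import numpy as np

def select_sensitive_filters(response_maps, label_center, k=100):
    """Rank filters by how close each response map's maximum response point
    lies to the label-picture centre, and keep the k closest.
    `response_maps` has shape (n_filters, H, W); `label_center` is (row, col)."""
    dists = []
    for r in response_maps:
        py, px = np.unravel_index(np.argmax(r), r.shape)   # maximum response point
        dists.append(np.hypot(py - label_center[0], px - label_center[1]))
    order = np.argsort(dists, kind="stable")               # smallest distance first
    return order[:k].tolist()
```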
Step 6, setting the first frame of the tracking video as the current frame.
Step 7, locating the tracked vehicle target in the next frame image of the current frame.
First, the position and size of the tracked vehicle target in the current frame are read, and the search area is obtained by centring on the target's centre point and doubling the width and the height.
Second, all pixel points within the search area are extracted from the next frame image of the current frame to form the search area picture, which is input into the public network model; the sensitive sub-features extracted by each filter in the sensitive filter combination determined in step 5 are summed to obtain the sensitive features of the search area picture.
Third, a correlation filtering operation is performed on the sensitive features and the filtering template to obtain a sensitive response map.
Fourth, the response point values in the sensitive response map are compared to determine the maximum response point, whose position is taken as the position of the tracked vehicle target in the next frame image.
Step 8, generating the target picture to be evaluated.
In the next frame of the current frame, centred on the located position, all pixel points in a region of the same size as the real target picture generated in the first substep of step 4 are extracted to form the target picture to be evaluated.
Step 9, the real target picture and the target picture to be evaluated are input into the discriminant connected network trained in step 3, and its output is checked: if the output is 1, the next frame of the current frame is set as the current frame and step 11 is executed; otherwise, the tracking is considered failed and step 10 is executed.
Step 10, relocating the tracked target.
The next frame of the current frame is input into a common detector, which outputs the position of the vehicle target to be tracked; this position is taken as the position of the tracked vehicle target in that frame, the next frame of the current frame is set as the current frame, and step 11 is executed.
Step 11, it is judged whether the current frame is the last frame of the tracking video; if so, step 12 is executed, otherwise step 7 is executed.
Step 12, the vehicle tracking process is finished.
The effect of the present invention will be further described with reference to simulation experiments.
1. Simulation conditions are as follows:
the simulation of the invention is carried out on an Ubuntu14.04 system with a CPU of Intel (R) core (TM) i8, a main frequency of 3.5GHz and a memory of 128G by using MATLAB R2014 software and a MatConvnet deep learning toolkit.
2. Simulation content and result analysis:
vehicle tracking in simulation experiment data is simulated by using the method and three methods (a nuclear correlation filtering algorithm is abbreviated as KCF, a full convolution connected network algorithm for tracking is abbreviated as Sim _ FC, and a hierarchical convolution characteristic for tracking is abbreviated as HCFT) in the prior art respectively.
The three prior-art methods adopted in the simulation experiment are:
The prior-art kernelized correlation filter algorithm KCF refers to the target tracking algorithm proposed by Henriques et al. in "High-Speed Tracking with Kernelized Correlation Filters [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 37(3): 583-", KCF algorithm for short.
The prior-art fully-convolutional Siamese network algorithm Siam_FC refers to the real-time target tracking algorithm proposed by Bertinetto L et al. in "Bertinetto L, Valmadre J, Henriques F, et al. Fully-Convolutional Siamese Networks for Object Tracking [J]. 2016", Siam_FC algorithm for short.
The prior-art hierarchical convolutional features tracker HCFT refers to the target tracking algorithm proposed by Ma C, Huang J B, Yang X, et al. in "Hierarchical Convolutional Features for Visual Tracking", HCFT algorithm for short.
The simulation experiment data are the common tracking databases OTB and TColor-128; the OTB database contains 100 video sequences and TColor-128 contains 128 video sequences. The tracking results of the four methods were evaluated with two indices: distance precision (DP) and overlap success rate (OP). The distance precision DP and overlap success rate OP of all videos in the two databases were calculated with the following formulas, and the average distance precision and average overlap success rate on the OTB and TColor-128 databases are reported in Table 1 and Table 2:
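For reference, the two evaluation indices can be computed as follows under their usual OTB definitions. The 20-pixel DP threshold and the 0.5 IoU threshold are assumptions here, since the patent's own formula images are not reproduced in this text:

```python
import numpy as np

def distance_precision(pred_centers, gt_centers, thresh=20.0):
    """DP: fraction of frames whose predicted centre lies within `thresh`
    pixels of the ground-truth centre."""
    diff = np.asarray(pred_centers, float) - np.asarray(gt_centers, float)
    dists = np.hypot(diff[:, 0], diff[:, 1])
    return float(np.mean(dists <= thresh))

def overlap_success(pred_boxes, gt_boxes, thresh=0.5):
    """OP: fraction of frames whose intersection-over-union with the
    ground-truth box exceeds `thresh`.  Boxes are (x, y, w, h)."""
    hits = 0
    for (ax, ay, aw, ah), (bx, by, bw, bh) in zip(pred_boxes, gt_boxes):
        iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))   # intersection width
        ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))   # intersection height
        inter = iw * ih
        union = aw * ah + bw * bh - inter
        hits += (inter / union) > thresh
    return hits / len(pred_boxes)
```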
the effect of the present invention is further described below with reference to the simulation diagrams of tables 1 and 2.
TABLE 1 OTB database distance accuracy and overlay success rate comparison chart
TABLE 2 TColor-128 database distance accuracy and overlay success rate comparison chart
As can be seen from Tables 1 and 2, the present invention achieves better distance precision and overlap success rate on both the OTB100 and TColor-128 databases, and thus a better tracking effect. This is mainly because the sensitive filter combination yields features that better describe the tracked vehicle target, and the tracked target is relocated after a tracking failure, giving a more accurate and more robust tracking result.
Claims (5)
1. A vehicle tracking method based on target feature sensitivity and deep learning is characterized in that a discriminant connected network is constructed and trained, features of a tracked vehicle are extracted through a trained public network model, a filter which is more sensitive to a vehicle target is selected from the trained public network model, and the vehicle target is tracked by using the discriminant connected network and the selected sensitive filter, wherein the method specifically comprises the following steps:
step 1, constructing a discriminant type connected network:
two identical sub-networks are built, and each sub-network has five layers of structures which are as follows in sequence: first convolution layer → first downsampling layer → second convolution layer → second downsampling layer → third convolution layer; setting the number of convolution kernels of the first convolution layer, the second convolution layer and the third convolution layer as 16, 32 and 1 in sequence, and setting the sizes of the convolution kernels as 3 x 3, 3 x 3 and 1 x 1 in sequence; setting the filter sizes of the first and second downsampling layers to be 2 x 2;
two sub-networks are arranged in parallel up and down and then connected with a cross-correlation layer XCorr to form a discriminant type connected network; setting a loss function of the discriminant connected network as a contrast loss function;
step 2, generating a training set:
randomly collecting at least 1000 pictures from a continuous video, wherein each picture comprises at least one target and marks the target; cutting the marked target in the picture into a 127 x 127 picture, and randomly cutting the background in the picture into a 127 x 127 picture;
combining the cut target picture and the cut background picture into a picture pair at random, wherein each picture pair at least comprises one target picture; if two pictures in the picture pair are the same target, setting the label of the picture pair to be 1; if the two pictures in the image pair are two different target pictures or a target picture and a background picture, setting the label of the picture pair to be 0; all the picture pairs and the labels thereof form a training set;
step 3, training the discriminant connected network:
inputting the training set into the discriminant connected network, and iteratively updating the network weights with the Adam optimization algorithm until the contrastive loss function converges, obtaining the trained discriminant connected network;
step 4, calculating a filtering template:
firstly, a rectangular frame is formed in a first frame of a tracking video and clings to the periphery of a tracked vehicle target, all pixel points in the range of the rectangular frame are extracted to form a real target picture, and all pixel points in the rectangular frame with the central point of the rectangular frame as the center and with the width and the height expanded by two times respectively form an initial filtering sample picture;
secondly, generating initial filter labels corresponding to each pixel point in the initial filter sample picture one by using a filter label generation formula, and forming the initial filter labels of all the pixel points into a label picture;
inputting the initial filtering sample picture into a trained public network model, outputting two-dimensional sub-feature matrixes with the same number as the last layer of filter of the model, and summing elements at the same positions in all the two-dimensional sub-feature matrixes to obtain a two-dimensional deep layer feature matrix of the initial filtering sample picture;
fourthly, generating a filtering template by using a filtering template calculation formula and the two-dimensional deep layer feature matrix of the tag picture and the initial filtering sample picture;
and 5, determining a sensitive filter combination:
firstly, performing related filtering operation by using each two-dimensional sub-feature matrix in an initial filtering picture and a filtering template to obtain response graphs with the same number as that of filters;
the second step, comparing the magnitude of each response point value in each response map and determining the maximum response point of each response map;
thirdly, the distance between the maximum response point of each response image and the central point of the label image is calculated, and filters corresponding to the first 100 distance values are found out according to the sequence from small to large to form a sensitive filter combination;
step 6, setting a first frame of the tracking video as a current frame;
step 7, positioning a tracked vehicle target in the next frame image of the current frame;
step 8, generating a target picture to be evaluated:
taking the positioned position as a center in the next frame of the current frame, extracting all pixel points in the area with the same size as the real target picture generated in the first step of the step 4 to form a target picture to be evaluated;
step 9, inputting the real target picture and the target picture to be evaluated into the discriminant connected network trained in the step 3, judging whether the output of the discriminant connected network is 1, if so, setting the next frame of the current frame as the current frame, and then executing the step 11; otherwise, the tracking is regarded as failed, and the step 10 is executed;
step 10, repositioning the tracking target:
inputting the next frame of the current frame into a common detector to output the position of the vehicle target to be tracked, taking the output target position as the position of the tracked vehicle target in the next frame of the current frame, and executing the step 11 after setting the next frame of the current frame as the current frame;
step 11, judging whether the current frame is the last frame of the tracking video, if so, executing step 12, otherwise, executing step 7;
and step 12, finishing the vehicle tracking process.
2. The vehicle tracking method based on target feature sensitivity and deep learning according to claim 1, characterized in that the filter label generation formula in the second step of step 4 is as follows:

g(x, y) = (1 / (2πσ²)) · e^( −((x − x_c)² + (y − y_c)²) / (2σ²) )

wherein g(x, y) represents the initial filter label corresponding to the pixel point at (x, y) in the filtering sample, π represents the circular constant, σ represents a control parameter with value 0.5, e represents exponentiation with the natural constant as base, x_c represents the abscissa of the central pixel point of the initial filtering sample picture, and y_c represents its ordinate.
3. The target feature sensitivity and deep learning based vehicle tracking method of claim 1, characterized in that: the trained public network model in the third step of step 4 refers to a public network model with a depth of at least 19 layers that has been trained on a public database containing more than one hundred thousand pictures.
4. The target feature sensitivity and deep learning based vehicle tracking method of claim 1, characterized in that: the calculation formula of the filtering template in the fourth step of step 4 is as follows:

F(h) = ( F(g) · F(f)* ) / ( F(f) · F(f)* )

wherein F(·) represents the Fourier transform operation, h represents the filtering template, * represents the conjugate transpose operation, g represents the label picture, f represents the two-dimensional deep feature matrix of the initial filtering sample picture, and the multiplication and division are performed element by element.
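A minimal sketch of the template computation follows, assuming the formula in claim 4 is the standard closed-form correlation-filter (ridge-regression) solution in the Fourier domain. The small regularizer `lam` and the `respond` helper are our additions for numerical stability and illustration, not part of the claim.

```python
import numpy as np

def train_filter(f, g, lam=1e-4):
    """Closed-form correlation-filter template: solve for the template in
    the Fourier domain from one feature map f and its label picture g."""
    F = np.fft.fft2(f)   # F(f): spectrum of the deep feature map
    G = np.fft.fft2(g)   # F(g): spectrum of the label picture
    # element-wise division; lam guards against near-zero spectral energy
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def respond(H, z):
    """Correlate a new feature map z with the template H: the response map
    peaks where z best matches the trained appearance."""
    return np.real(np.fft.ifft2(H * np.fft.fft2(z)))
```

Filtering the training features with their own template reproduces (approximately) the label picture, so the response peak lands on the target centre.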
5. The target feature sensitivity and deep learning based vehicle tracking method of claim 1, characterized in that: the specific steps of positioning the tracked vehicle target in the next frame image of the current frame in step 7 are as follows:
reading the position and the size of a tracked vehicle target in a current frame of a tracking video, and obtaining a search area range by taking the central point position of the vehicle target as a center and expanding the width and the height by two times respectively;
secondly, extracting all pixel points in a search area range from a next frame image of a current frame of the tracking video to form a search area picture, inputting the search area picture into a public network model, and summing sensitive sub-features extracted by each filter in the sensitive filter combination determined in the step 5 to obtain sensitive features of the search area picture;
thirdly, performing relevant filtering operation on the sensitive features and a filtering template to obtain a sensitive response graph;
and fourthly, comparing the magnitude of each response point value in the sensitive response graph, determining a maximum response point, and taking the position of the maximum response point as the position of the tracked vehicle target in the next frame of image.
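The two-fold search-area expansion of the first step and the peak search of the fourth step can be sketched as follows. The helper names are hypothetical, and feature extraction and filtering (the second and third steps) are omitted.

```python
import numpy as np

def search_region(cx, cy, w, h):
    """First step of claim 5: centre the region on the target's previous
    centre point and expand the width and the height two-fold."""
    return (cx - w, cy - h, 2 * w, 2 * h)  # (left, top, width, height)

def locate_target(response):
    """Fourth step of claim 5: the target's new position is the coordinate
    of the maximum point of the sensitive response map."""
    idx = np.argmax(response)
    return np.unravel_index(idx, response.shape)  # (row, col) of the peak
```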
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911408023.5A CN111062973B (en) | 2019-12-31 | 2019-12-31 | Vehicle tracking method based on target feature sensitivity and deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111062973A CN111062973A (en) | 2020-04-24 |
CN111062973B true CN111062973B (en) | 2021-01-01 |
Family
ID=70305372
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911408023.5A Active CN111062973B (en) | 2019-12-31 | 2019-12-31 | Vehicle tracking method based on target feature sensitivity and deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111062973B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11790664B2 (en) | 2019-02-19 | 2023-10-17 | Tesla, Inc. | Estimating object properties using visual image data |
US11797304B2 (en) | 2018-02-01 | 2023-10-24 | Tesla, Inc. | Instruction set architecture for a vector computational unit |
US11816585B2 (en) | 2018-12-03 | 2023-11-14 | Tesla, Inc. | Machine learning models operating at different frequencies for autonomous vehicles |
US11841434B2 (en) | 2018-07-20 | 2023-12-12 | Tesla, Inc. | Annotation cross-labeling for autonomous control systems |
US11893393B2 (en) | 2017-07-24 | 2024-02-06 | Tesla, Inc. | Computational array microprocessor system with hardware arbiter managing memory requests |
US11893774B2 (en) | 2018-10-11 | 2024-02-06 | Tesla, Inc. | Systems and methods for training machine models with augmented data |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018176000A1 (en) | 2017-03-23 | 2018-09-27 | DeepScale, Inc. | Data synthesis for autonomous control systems |
US11409692B2 (en) | 2017-07-24 | 2022-08-09 | Tesla, Inc. | Vector computational unit |
US10671349B2 (en) | 2017-07-24 | 2020-06-02 | Tesla, Inc. | Accelerated mathematical engine |
US11157441B2 (en) | 2017-07-24 | 2021-10-26 | Tesla, Inc. | Computational array microprocessor system using non-consecutive data formatting |
US11215999B2 (en) | 2018-06-20 | 2022-01-04 | Tesla, Inc. | Data pipeline and deep learning system for autonomous driving |
US11636333B2 (en) | 2018-07-26 | 2023-04-25 | Tesla, Inc. | Optimizing neural network structures for embedded systems |
US11562231B2 (en) | 2018-09-03 | 2023-01-24 | Tesla, Inc. | Neural networks for embedded devices |
US11196678B2 (en) | 2018-10-25 | 2021-12-07 | Tesla, Inc. | QOS manager for system on a chip communications |
US11537811B2 (en) | 2018-12-04 | 2022-12-27 | Tesla, Inc. | Enhanced object detection for autonomous vehicles based on field view |
US11610117B2 (en) | 2018-12-27 | 2023-03-21 | Tesla, Inc. | System and method for adapting a neural network model on a hardware platform |
US11150664B2 (en) | 2019-02-01 | 2021-10-19 | Tesla, Inc. | Predicting three-dimensional features for autonomous driving |
US10997461B2 (en) | 2019-02-01 | 2021-05-04 | Tesla, Inc. | Generating ground truth for machine learning from time series elements |
US11567514B2 (en) | 2019-02-11 | 2023-01-31 | Tesla, Inc. | Autonomous and user controlled vehicle summon to a target |
CN113344932B (en) * | 2021-06-01 | 2022-05-03 | 电子科技大学 | Semi-supervised single-target video segmentation method |
CN113920250B (en) * | 2021-10-21 | 2023-05-23 | 广东三维家信息科技有限公司 | House type code matching method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108171112A (en) * | 2017-12-01 | 2018-06-15 | 西安电子科技大学 | Vehicle identification and tracking based on convolutional neural networks |
CN108280808A (en) * | 2017-12-15 | 2018-07-13 | 西安电子科技大学 | The method for tracking target of correlation filter is exported based on structuring |
CN110210551A (en) * | 2019-05-28 | 2019-09-06 | 北京工业大学 | A kind of visual target tracking method based on adaptive main body sensitivity |
CN110473231A (en) * | 2019-08-20 | 2019-11-19 | 南京航空航天大学 | A kind of method for tracking target of the twin full convolutional network with anticipation formula study more new strategy |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190370553A1 (en) * | 2018-05-09 | 2019-12-05 | Wizr Llc | Filtering of false positives using an object size model |
Non-Patent Citations (2)
Title |
---|
Deep visual tracking:review and experimental comparison;Peixia Li,et al.;《Pattern Recognition》;20181231;323-338 * |
Research on Vehicle Detection Based on Deep Learning; Lin Xiaocui; 《Wanfang Degree Thesis Full-text Database》; 20170621; full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111062973B (en) | Vehicle tracking method based on target feature sensitivity and deep learning | |
Wang et al. | Adaptive DropBlock-enhanced generative adversarial networks for hyperspectral image classification | |
CN108665481B (en) | Self-adaptive anti-blocking infrared target tracking method based on multi-layer depth feature fusion | |
CN108416266B (en) | Method for rapidly identifying video behaviors by extracting moving object through optical flow | |
CN110084249A (en) | The image significance detection method paid attention to based on pyramid feature | |
CN108304873A (en) | Object detection method based on high-resolution optical satellite remote-sensing image and its system | |
CN108346159A (en) | A kind of visual target tracking method based on tracking-study-detection | |
CN104835175B (en) | Object detection method in a kind of nuclear environment of view-based access control model attention mechanism | |
CN106780485A (en) | SAR image change detection based on super-pixel segmentation and feature learning | |
CN107748873A (en) | A kind of multimodal method for tracking target for merging background information | |
CN102495998B (en) | Static object detection method based on visual selective attention computation module | |
CN106991396A (en) | A kind of target relay track algorithm based on wisdom street lamp companion | |
CN108876776B (en) | Classification model generation method, fundus image classification method and device | |
CN109886267A (en) | A kind of soft image conspicuousness detection method based on optimal feature selection | |
CN115661754B (en) | Pedestrian re-recognition method based on dimension fusion attention | |
CN112329771B (en) | Deep learning-based building material sample identification method | |
CN110991547A (en) | Image significance detection method based on multi-feature optimal fusion | |
CN105894037A (en) | Whole supervision and classification method of remote sensing images extracted based on SIFT training samples | |
CN109635726A (en) | A kind of landslide identification method based on the symmetrical multiple dimensioned pond of depth network integration | |
CN104537381A (en) | Blurred image identification method based on blurred invariant feature | |
Yun et al. | Part-level convolutional neural networks for pedestrian detection using saliency and boundary box alignment | |
CN106407975A (en) | Multi-dimensional layered object detection method based on space-spectrum constraint | |
CN111832508B (en) | DIE _ GA-based low-illumination target detection method | |
CN111127407B (en) | Fourier transform-based style migration forged image detection device and method | |
CN117437691A (en) | Real-time multi-person abnormal behavior identification method and system based on lightweight network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||