CN111899283B - Video target tracking method - Google Patents
- Publication number: CN111899283B (application CN202010753190.XA; also published as CN111899283A)
- Authority: CN (China)
- Prior art keywords: convolution, correlation, target, layer, tracking
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/22 — Matching criteria, e.g. proximity measures
- G06N3/045 — Combinations of networks
- G06N3/047 — Probabilistic or stochastic networks
- G06N3/08 — Learning methods
- G06T2207/10016 — Video; Image sequence
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
Abstract
The application provides a video target tracking method, belonging to the field of computer vision. The method comprises the following steps: inputting a target image and a search image simultaneously into a hierarchical correlation twin network for feature extraction to obtain the convolution features extracted by different convolution layers; performing a correlation measurement on the convolution features of the target image and the search image extracted by the same convolution layer, and hierarchically stitching the correlations of the layers to generate the hierarchical correlation; taking the position in the search image with the highest tracking response in the hierarchical correlation as the center position of the tracking target in the search image; and determining the position of the tracking target in the search image according to that center position and independent scale factors. With this method, any target can be tracked accurately.
Description
Technical Field
The application relates to the field of computer vision, in particular to a video target tracking method.
Background
In recent years, with the improvement of living standards and great changes in the automobile manufacturing industry, the number of automobiles has increased rapidly, while available road resources have become scarcer. Human reaction and perception capabilities are limited, and misjudgment of information fed back from the surroundings has kept the traffic accident rate climbing in recent years. According to incomplete statistics, more than 30 million people worldwide have died in traffic accidents involving automobiles, exceeding the death toll of large-scale wars. With the revolutionary changes that internet technology has brought to the automobile manufacturing industry, unmanned vehicles show rapid development potential in today's society; their main purpose is to free people from complex driving operations and improve the safety of vehicles on the road.
However, putting unmanned vehicles into practice still presents real difficulty, and the most critical problem is that an unmanned vehicle cannot, like a human brain, accurately judge complex road and obstacle conditions from prior experience. Video target tracking is a key link in unmanned driving: by tracking the target in front of the vehicle in real time, the dynamics of that target can be grasped, providing a basis for the unmanned vehicle to make correct decisions in the current environment. This ensures necessary basic operations during driving, such as maintaining following distance, changing lanes, and adjusting speed, greatly improves the performance of the unmanned vehicle, reduces unnecessary accidents, and improves driving safety.
However, existing video target tracking methods suffer from problems such as low tracking accuracy.
Disclosure of Invention
The embodiment of the application provides a video target tracking method, which can improve the accuracy of target tracking. The technical scheme is as follows:
in one aspect, a video object tracking method is provided, the method being applied to an electronic device, the method comprising:
inputting the target image and the search image into a hierarchy correlation twin network at the same time to perform feature extraction to obtain convolution features extracted by different convolution layers, performing correlation measurement on the convolution features of the target image and the search image extracted by the same convolution layer, and generating hierarchy correlation by splicing the correlation of each layer, wherein the target image comprises: tracking a target;
taking the position of the search image with the highest tracking response in the hierarchical correlation as the center position of the tracking target in the search image;
and determining the position of the tracking target in the search image according to the central position of the tracking target in the search image and the independent scale factors.
Further, the step of simultaneously inputting the target image and the search image into the hierarchy correlation twin network to perform feature extraction, and the step of obtaining convolution features extracted by different convolution layers includes:
simultaneously inputting the target image and the search image into two branches of the hierarchical correlation twin network for feature extraction to perform convolution calculation, so as to obtain convolution features extracted by different convolution layers;
each branch for feature extraction in the hierarchical correlation twin network has the structure: (Conv1 + ReLU + LRN + MaxPool) — (Conv2 + ReLU + LRN + MaxPool) — (Conv3 + ReLU) — (Conv4 + ReLU) — (Conv5 + ReLU);
where Conv denotes a convolution layer, ReLU denotes the nonlinear activation function, LRN denotes the local response normalization layer, and MaxPool denotes the max pooling layer.
Further, the correlation measurement between the convolution features of the target image and the search image extracted by the same convolution layer is computed as:
F(z, x)_i = φ_i(z) ⋆ φ_i(x) + β
wherein F(z, x)_i represents the correlation between the convolution features of the target image and the search image extracted by convolution layer i; z and x represent the target image and the search image, respectively; φ_i(·) represents the convolution features output by convolution layer i; ⋆ denotes cross-correlation; and β represents a bias term.
Further, the step of taking the position of the search image with the highest tracking response in the hierarchical correlation as the center position of the tracking target in the search image comprises the following steps:
inputting the maximum correlation in the hierarchical correlation into a correlation attention module; the structure of the correlation attention module is: fully connected layer 1 - fully connected layer 2 - fully connected layer 3 - fully connected layer 4 - softmax layer;
learning the correlation among the convolution features of different layers through the four fully connected layers, and assigning a corresponding weight to each convolution layer through the softmax layer;
and determining the tracking response of each convolution layer according to that layer's correlation and its assigned weight, and taking the position of the search image with the highest tracking response as the center position of the tracking target in the search image.
Further, the highest tracking response is expressed as:
Y(z, x) = Σ_{i=1}^{5} α_i · (φ_i(z) ⋆ φ_i(x)) + β
wherein Y(z, x) represents the highest tracking response; z and x represent the target image and the search image, respectively; φ_i(·) represents the convolution features output by convolution layer i; α_i is the weight assigned to convolution layer i; ⋆ denotes cross-correlation; and β represents a bias term.
Further, the scale factor independent in the width direction is expressed as:
s_w · (w + p) = A_w
and the scale factor independent in the height direction is expressed as:
s_h · (h + p) = A_h
wherein s_w and s_h represent the scale factors of the target in the width direction and the height direction, respectively; w and h represent the width and height of the target, respectively; p represents the padding region; A_w and A_h represent the sizes of the input target image in the width direction and the height direction, respectively.
Further, the padding region p is expressed as:
p = (w + h)/2.
in one aspect, an electronic device is provided that includes a processor and a memory having at least one instruction stored therein that is loaded and executed by the processor to implement the video object tracking method described above.
In one aspect, a computer readable storage medium having stored therein at least one instruction that is loaded and executed by a processor to implement the video object tracking method described above is provided.
The technical scheme provided by the embodiments of the application has at least the following beneficial effects:
1) A hierarchical correlation twin network is built on the basis of the twin network to track the target; it can comprehensively utilize the feature information of multiple convolution layers, enriching the candidate positions of the tracking target and improving the tracking accuracy of the video target tracking method;
2) Through the correlation attention module, the method can adapt when tracking different targets by assigning different weights to the correlation of each layer, further strengthening the selection of tracking target positions and improving tracking accuracy;
3) Independent scale factors are used in the width and height directions of the tracking target to adjust the output frame (i.e., the size of the tracking target), which reduces deformation of the output frame and increases tracking accuracy;
4) The method is more robust to complex backgrounds and large scale changes of the tracking target.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a video target tracking method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a hierarchical correlation twin network according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a correlation attention module according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
As shown in fig. 1, an embodiment of the present application provides a video object tracking method, which may be implemented by an electronic device, where the electronic device may be a terminal or a server, and the method includes:
s101, inputting a target image and a search image into a hierarchy correlation twin network at the same time to perform feature extraction to obtain convolution features extracted by different convolution layers, performing correlation measurement on the convolution features of the target image and the search image extracted by the same convolution layer, and generating hierarchy correlation by hierarchical stitching of the correlation of each layer, wherein the target image comprises: tracking a target;
s102, taking the position of the search image with highest tracking response in the hierarchical correlation as the center position of the tracking target in the search image;
s103, determining the position of the tracking target in the search image according to the central position of the tracking target in the search image and the independent scale factors.
According to the video target tracking method, a hierarchical correlation twin network is provided on the basis of the twin network to track the target, and the hierarchical correlation twin network can comprehensively utilize the characteristic information of a plurality of convolution layers, so that the selection of tracking target positions is increased, and the tracking accuracy of the video target tracking method is improved; because independent scale factors are used, the influence on deformation of a tracking target caused by zooming pictures can be reduced, and the accuracy of target tracking is further improved.
In a specific embodiment of the foregoing video object tracking method, further, each branch for feature extraction in the hierarchical correlation twin network has the structure: (Conv1 + ReLU + LRN + MaxPool) — (Conv2 + ReLU + LRN + MaxPool) — (Conv3 + ReLU) — (Conv4 + ReLU) — (Conv5 + ReLU);
where Conv denotes a convolution layer, ReLU denotes the nonlinear activation function, LRN denotes the local response normalization layer, and MaxPool denotes the max pooling layer.
In this embodiment, the structure for extracting features in the hierarchical correlation twin network includes: five convolutional layers (conv); to prevent the gradient vanishing problem, a ReLU nonlinear activation function is added after each convolution layer.
In this embodiment, after the ReLU nonlinear activation functions of conv1 and conv2, a local response normalization layer is connected to accelerate convergence of the hierarchical correlation twin network, and at the same time, a maximum pooling layer is connected to reduce the size of the feature map after the local response normalization layer.
In this embodiment, since the target tracking task is different from the target detection task, it is not necessary to output the category of the tracking target, and thus, it is not necessary to use a full connection layer when extracting the features.
In this embodiment, as shown in fig. 2, for example, a target image of size 127×127 and a search image of size 255×255 may be input simultaneously into the two feature-extraction branches of the hierarchical correlation twin network for convolution calculation, obtaining the convolution features extracted by the different convolution layers; a correlation measurement is then performed on the convolution features of the target image and the search image extracted by each same convolution layer, and the correlations of the layers (five in total) are hierarchically stitched to generate a 5×17×17 hierarchical correlation, where 17×17 is the size of the correlation map from each convolution layer and 5 is the number of convolution layers.
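As a rough sketch of how these sizes arise, the feature-map sizes can be traced through an assumed SiamFC-style AlexNet backbone. The kernel sizes and strides below are assumptions; the patent only fixes the input sizes 127 and 255 and the 17×17 correlation maps:

```python
# Trace spatial sizes through an assumed five-layer backbone.
# Kernel/stride values are illustrative assumptions, not the patent's.
def out_size(size, kernel, stride):
    # "valid" convolution / pooling: no padding
    return (size - kernel) // stride + 1

def backbone(size):
    size = out_size(size, 11, 2)  # conv1
    size = out_size(size, 3, 2)   # max pool 1
    size = out_size(size, 5, 1)   # conv2
    size = out_size(size, 3, 2)   # max pool 2
    size = out_size(size, 3, 1)   # conv3
    size = out_size(size, 3, 1)   # conv4
    size = out_size(size, 3, 1)   # conv5
    return size

z = backbone(127)    # target branch: 127 -> 6
x = backbone(255)    # search branch: 255 -> 22
corr = x - z + 1     # valid cross-correlation output size
print(z, x, corr)
```

With these assumed parameters, the final-layer correlation map works out to 17×17, matching the size stated above.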
In this embodiment, the hierarchical concatenation may be understood as superposition, like RGB, where different R, G, B values are superimposed to show different colors.
In a specific embodiment of the foregoing video object tracking method, further, the correlation measurement between the convolution features of the target image and the search image extracted by the same convolution layer is computed as:
F(z, x)_i = φ_i(z) ⋆ φ_i(x) + β
wherein F(z, x)_i represents the correlation between the convolution features of the target image and the search image extracted by convolution layer i; z and x represent the target image and the search image, respectively; φ_i(·) represents the convolution features output by convolution layer i; ⋆ denotes cross-correlation; and β represents a bias term.
In this embodiment, the correlation metric can measure the difference between the convolution characteristics of the target image and the search image extracted by the same layer of convolution layers located in different branches.
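As a minimal sketch of this per-layer correlation measurement, the target feature map can be slid over the search feature map as a valid cross-correlation, summed over channels, plus a bias. The NumPy implementation and the feature shapes below are illustrative assumptions:

```python
import numpy as np

def layer_correlation(phi_z, phi_x, beta=0.0):
    """Valid cross-correlation of target features phi_z (C, hz, wz)
    over search features phi_x (C, hx, wx), summed over channels,
    plus a bias, mirroring F(z, x)_i."""
    _, hz, wz = phi_z.shape
    _, hx, wx = phi_x.shape
    out = np.empty((hx - hz + 1, wx - wz + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot product of the target features with one search window
            out[i, j] = np.sum(phi_x[:, i:i + hz, j:j + wz] * phi_z)
    return out + beta

rng = np.random.default_rng(0)
phi_z = rng.standard_normal((8, 6, 6))    # layer-i target features (assumed shape)
phi_x = rng.standard_normal((8, 22, 22))  # layer-i search features (assumed shape)
f_i = layer_correlation(phi_z, phi_x)
print(f_i.shape)  # (17, 17)
```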
In a specific embodiment of the foregoing video object tracking method, further, the step of taking the position of the search image with the highest tracking response in the hierarchical correlation as the center position of the tracking target in the search image comprises:
inputting the maximum correlation in the hierarchical correlation into a correlation attention module; the structure of the correlation attention module is: fully connected layer 1 - fully connected layer 2 - fully connected layer 3 - fully connected layer 4 - softmax layer, as shown in fig. 3;
learning the correlation among the convolution features of different layers through the four fully connected layers, and assigning a corresponding weight to each convolution layer through the softmax layer;
and determining the tracking response of each convolution layer according to that layer's correlation and its assigned weight, and taking the position of the search image with the highest tracking response as the center position of the tracking target in the search image.
According to the method, the video target tracking method can be adaptively adjusted when tracking different targets through the correlation attention module, different weights can be distributed to the correlation of each layer, the selection of the positions of the tracked targets is further enhanced, and the tracking accuracy is improved.
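A sketch of this weighting step, under the assumption that the five per-layer correlation maps have been stitched into a 5×17×17 volume. The four fully connected layers are not reproduced here (the patent does not give their widths); illustrative logits stand in for their output, and only the softmax weighting and peak localization are shown:

```python
import numpy as np

def softmax(v):
    # numerically stable softmax over a 1-D vector
    e = np.exp(v - v.max())
    return e / e.sum()

# Five per-layer correlation maps, hierarchically stitched (stacked
# like RGB channels) into a 5x17x17 volume; random stand-in values.
rng = np.random.default_rng(1)
hier_corr = rng.standard_normal((5, 17, 17))

# Illustrative logits standing in for the output of the four fully
# connected layers of the correlation attention module (not trained).
logits = np.array([0.1, 0.5, 1.2, 0.8, 0.3])
alpha = softmax(logits)  # one weight per convolution layer, summing to 1

# Weighted tracking response; its peak gives the target center position.
response = np.tensordot(alpha, hier_corr, axes=1)  # shape (17, 17)
center = np.unravel_index(np.argmax(response), response.shape)
print(response.shape, center)
```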
In a specific embodiment of the foregoing video object tracking method, further, the highest tracking response is expressed as:
Y(z, x) = Σ_{i=1}^{5} α_i · (φ_i(z) ⋆ φ_i(x)) + β
wherein Y(z, x) represents the highest tracking response; z and x represent the target image and the search image, respectively; φ_i(·) represents the convolution features output by convolution layer i; α_i is the weight assigned to convolution layer i; ⋆ denotes cross-correlation; and β represents a bias term.
In a specific embodiment of the foregoing video object tracking method, further, the scale factor independent in the width direction is expressed as:
s_w · (w + p) = A_w
and the scale factor independent in the height direction is expressed as:
s_h · (h + p) = A_h
wherein s_w and s_h represent the scale factors of the tracking target in the width direction and the height direction, respectively; w and h represent the width and height of the tracking target on the target image, respectively; p represents the padding region, p = (w + h)/2; A_w and A_h represent the width and height of the input target image, respectively.
In this embodiment, the scale factors s_w and s_h of the tracking target in the width and height directions can be calculated from the width w and height h of the tracking target on the target image and the width A_w and height A_h of the input target image; the width and height of the tracking target on the search image are then calculated according to s_w and s_h as the final size of the output frame. Thus, independent scale factors are used to adjust the output frame (i.e., the size of the tracking target) in the width and height directions, which reduces deformation of the output frame and increases tracking accuracy.
In this embodiment, by using independent scale factors, the transformation of one dimension in the width and height directions will not affect the other dimension, and the tracking target will not be deformed basically, so that the influence on the deformation of the tracking target caused by scaling the target to a uniform size can be reduced.
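Rearranging the two relations above gives s_w = A_w / (w + p) and s_h = A_h / (h + p) with p = (w + h)/2, which can be sketched as follows (the 127×127 input size follows the earlier example; the target width and height are illustrative values):

```python
def scale_factors(w, h, a_w=127.0, a_h=127.0):
    """Independent width/height scale factors; a_w and a_h are the
    input target-image dimensions (127x127 in the example above)."""
    p = (w + h) / 2  # padding region
    return a_w / (w + p), a_h / (h + p)

# Illustrative tracking target of 40x60 pixels on the target image.
s_w, s_h = scale_factors(40, 60)
print(round(s_w, 3), round(s_h, 3))
# Stretching one dimension only changes its own scale factor, so the
# other dimension of the output frame is left undistorted.
```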
According to the video target tracking method provided by the embodiment of the application, a hierarchical correlation twin network is provided on the basis of the twin network to track the target, and a correlation attention module and independent scale factors are designed for the hierarchical correlation twin network, so that compared with the video target tracking method based on a pure twin network, the video target tracking method has higher tracking accuracy and is more robust to the conditions of complex background and larger scale change of the tracked target.
Next, the effectiveness of the video target tracking method provided by the embodiment of the application is verified, as follows:
The method was implemented using the Python programming language and the TensorFlow deep learning framework.
The ILSVRC2015-VID data set is used as the training data set: two frames of images, no more than 100 frames apart, are randomly selected from one video segment, cropped and scaled to a fixed size, and then input into the network.
In order to accelerate the convergence of the tracking model (comprising the hierarchical correlation twin network and the correlation attention module) during training, the optimization method is momentum gradient descent: an exponentially weighted average of the gradients replaces the raw gradient in the parameter update, with momentum set to 0.9.
The iteration batch size is set to 8 image pairs, the initial learning rate is 0.01, the decay coefficient is 0.86, and training runs for 60 rounds of 53200 image pairs each.
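The momentum update and the exponential learning-rate decay described above can be sketched as follows (the exact momentum formulation and the per-round application of the decay coefficient are assumptions):

```python
def momentum_step(param, grad, velocity, lr, momentum=0.9):
    """Momentum gradient descent: an exponentially weighted average of
    past gradients replaces the raw gradient (a common formulation;
    the patent's exact form is an assumption)."""
    velocity = momentum * velocity + grad
    return param - lr * velocity, velocity

def learning_rate(epoch, lr0=0.01, decay=0.86):
    """Initial rate 0.01 decayed by 0.86, assumed once per training round."""
    return lr0 * decay ** epoch

# One illustrative scalar parameter update at round 0.
p, v = 1.0, 0.0
p, v = momentum_step(p, grad=0.5, velocity=v, lr=learning_rate(0))
print(p)
```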
After training, the trained tracking model is tested on the OTB50, OTB100, VOT2015 and VOT2016 data sets. The tests show that the tracking model provided by this embodiment improves tracking accuracy by 6.5% compared with a video target tracking model based on a plain twin network (for example, a fully convolutional twin network), and improves tracking performance without reducing speed.
Fig. 4 is a schematic structural diagram of an electronic device 600 according to an embodiment of the present application, where the electronic device 600 may have a relatively large difference due to different configurations or performances, and may include one or more processors (central processing units, CPU) 601 and one or more memories 602, where at least one instruction is stored in the memories 602, and the at least one instruction is loaded and executed by the processors 601 to implement the video object tracking method described above.
In an exemplary embodiment, a computer readable storage medium, such as a memory comprising instructions executable by a processor in a terminal to perform the video object tracking method described above, is also provided. For example, the computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.
Claims (3)
1. A video object tracking method, comprising:
inputting the target image and the search image into a hierarchy correlation twin network at the same time to perform feature extraction to obtain convolution features extracted by different convolution layers, performing correlation measurement on the convolution features of the target image and the search image extracted by the same convolution layer, and generating hierarchy correlation by splicing the correlation of each layer, wherein the target image comprises: tracking a target;
taking the position of the search image with the highest tracking response in the hierarchical correlation as the center position of the tracking target in the search image;
determining the position of the tracking target in the search image according to the central position of the tracking target in the search image and the independent scale factors;
the step of taking the position of the search image with the highest tracking response in the hierarchical correlation as the center position of the tracking target in the search image comprises the following steps:
inputting the maximum correlation in the hierarchical correlation into a correlation attention module; the structure of the correlation attention module is: fully connected layer 1 - fully connected layer 2 - fully connected layer 3 - fully connected layer 4 - softmax layer;
learning the correlation among the convolution features of different layers through the four fully connected layers, and assigning a corresponding weight to each convolution layer through the softmax layer;
determining the tracking response of each convolution layer according to that layer's correlation and its assigned weight, and taking the position of the search image with the highest tracking response as the center position of the tracking target in the search image;
wherein the highest tracking response is expressed as:
Y(z, x) = Σ_{i=1}^{5} α_i · (φ_i(z) ⋆ φ_i(x)) + β
wherein Y(z, x) represents the highest tracking response; z and x represent the target image and the search image, respectively; φ_i(·) represents the convolution features output by convolution layer i; α_i is the weight assigned to convolution layer i; ⋆ denotes cross-correlation; and β represents a bias term;
wherein the scale factor independent in the width direction is expressed as:
s_w · (w + p) = A_w
and the scale factor independent in the height direction is expressed as:
s_h · (h + p) = A_h
wherein s_w and s_h represent the scale factors of the tracking target in the width direction and the height direction, respectively; w and h represent the width and height of the tracking target on the target image, respectively; p represents the padding region; A_w and A_h represent the width and height of the input target image, respectively;
wherein the padding region p is expressed as:
p = (w + h)/2.
2. the method of claim 1, wherein the step of simultaneously inputting the target image and the search image into the hierarchical correlation twin network to perform feature extraction, and obtaining convolution features extracted by different convolution layers comprises:
simultaneously inputting the target image and the search image into two branches of the hierarchical correlation twin network for feature extraction to perform convolution calculation, so as to obtain convolution features extracted by different convolution layers;
each branch for feature extraction in the hierarchical correlation twin network has the structure: (Conv1 + ReLU + LRN + MaxPool) — (Conv2 + ReLU + LRN + MaxPool) — (Conv3 + ReLU) — (Conv4 + ReLU) — (Conv5 + ReLU);
where Conv denotes a convolution layer, ReLU denotes the nonlinear activation function, LRN denotes the local response normalization layer, and MaxPool denotes the max pooling layer.
3. The video object tracking method according to claim 1, wherein the correlation measurement between the convolution features of the target image and the search image extracted by the same convolution layer is computed as:
F(z, x)_i = φ_i(z) ⋆ φ_i(x) + β
wherein F(z, x)_i represents the correlation between the convolution features of the target image and the search image extracted by convolution layer i; z and x represent the target image and the search image, respectively; φ_i(·) represents the convolution features output by convolution layer i; ⋆ denotes cross-correlation; and β represents a bias term.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010753190.XA CN111899283B (en) | 2020-07-30 | 2020-07-30 | Video target tracking method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111899283A CN111899283A (en) | 2020-11-06 |
CN111899283B true CN111899283B (en) | 2023-10-17 |
Family
ID=73182806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010753190.XA Active CN111899283B (en) | 2020-07-30 | 2020-07-30 | Video target tracking method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111899283B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116360492B (en) * | 2023-04-03 | 2024-01-30 | 北京科技大学 | Object tracking method and system for flapping wing flying robot |
CN117809025A (en) * | 2024-03-01 | 2024-04-02 | 深圳魔视智能科技有限公司 | Attention network-based target tracking method, device, equipment and storage medium |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017156886A (en) * | 2016-02-29 | 2017-09-07 | KDDI Corporation | Device, program and method for tracking an object, taking the degree of similarity between images into account
CN109978921A (en) * | 2019-04-01 | 2019-07-05 | Nanjing University of Information Science and Technology | Real-time video target tracking algorithm based on a multi-layer attention mechanism
CN110021033A (en) * | 2019-02-22 | 2019-07-16 | Guangxi Normal University | Target tracking method based on a pyramid twin network
CN110286683A (en) * | 2019-07-15 | 2019-09-27 | University of Science and Technology Beijing | Autonomous path tracking control method for a tracked mobile robot
CN110490906A (en) * | 2019-08-20 | 2019-11-22 | Nanjing University of Posts and Telecommunications | Real-time visual target tracking method based on a twin convolutional network and a long short-term memory network
CN110675429A (en) * | 2019-09-24 | 2020-01-10 | Hunan University of Humanities, Science and Technology | Long- and short-range complementary target tracking method based on a twin network and correlation filters
CN111161317A (en) * | 2019-12-30 | 2020-05-15 | Beijing University of Technology | Single-target tracking method based on multiple networks
CN111192292A (en) * | 2019-12-27 | 2020-05-22 | Shenzhen University | Target tracking method based on an attention mechanism and a twin network, and related equipment
CN111260688A (en) * | 2020-01-13 | 2020-06-09 | Shenzhen University | Twin dual-path target tracking method
CN111291679A (en) * | 2020-02-06 | 2020-06-16 | Xiamen University | Target-specific response attention target tracking method based on a twin network
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10902243B2 (en) * | 2016-10-25 | 2021-01-26 | Deep North, Inc. | Vision based target tracking that distinguishes facial feature targets |
US20180129934A1 (en) * | 2016-11-07 | 2018-05-10 | Qualcomm Incorporated | Enhanced siamese trackers |
US11308350B2 (en) * | 2016-11-07 | 2022-04-19 | Qualcomm Incorporated | Deep cross-correlation learning for object tracking |
US10902615B2 (en) * | 2017-11-13 | 2021-01-26 | Qualcomm Incorporated | Hybrid and self-aware long-term object tracking |
US11055854B2 (en) * | 2018-08-23 | 2021-07-06 | Seoul National University R&Db Foundation | Method and system for real-time target tracking based on deep learning |
US11493908B2 (en) * | 2018-11-13 | 2022-11-08 | Rockwell Automation Technologies, Inc. | Industrial safety monitoring configuration using a digital twin |
2020-07-30: CN application CN202010753190.XA (patent CN111899283B), status Active
Non-Patent Citations (5)
Title |
---|
Deep Siamese Network for Multiple Object Tracking; Bonan Cuan et al.; 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP); 1-6 *
Feature Deep Continuous Aggregation for 3D Vehicle Detection; Zhao, K. et al.; Applied Sciences-Basel; Vol. 9, No. 24; 1-17 *
Hierarchical correlation siamese network for real-time object tracking; Yu Meng et al.; Applied Intelligence; Vol. 51, No. 6; 3202-3211 *
Vehicle path tracking control based on variable prediction horizon and velocity; Bai Guoxing et al.; China Mechanical Engineering; Vol. 31, No. 11; 1277-1284 *
Binocular stereo matching fusing multi-scale local features and deep features; Wang Xuchu et al.; Acta Optica Sinica; Vol. 40, No. 2; 119-131 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107945204B (en) | Pixel-level image matting method based on a generative adversarial network | |
KR102635987B1 (en) | Method, apparatus, device and storage medium for training an image semantic segmentation network | |
CN112052787B (en) | Target detection method and device based on artificial intelligence and electronic equipment | |
CN107169421B (en) | Automobile driving scene target detection method based on deep convolutional neural network | |
CN111554105B (en) | Intelligent traffic identification and statistics method for complex traffic intersection | |
CN111899283B (en) | Video target tracking method | |
CN111353505B (en) | Device based on network model capable of realizing semantic segmentation and depth of field estimation jointly | |
CN112307978A (en) | Target detection method and device, electronic equipment and readable storage medium | |
CN110689043A (en) | Vehicle fine granularity identification method and device based on multiple attention mechanism | |
CN111126459A (en) | Method and device for identifying fine granularity of vehicle | |
CN112464912B (en) | Robot end face detection method based on YOLO-RGGNet | |
CN115512251A (en) | Unmanned aerial vehicle low-illumination target tracking method based on double-branch progressive feature enhancement | |
CN113269133A (en) | Unmanned aerial vehicle visual angle video semantic segmentation method based on deep learning | |
CN111179272B (en) | Rapid semantic segmentation method for road scene | |
CN114360239A (en) | Traffic prediction method and system for multi-layer spatio-temporal traffic knowledge graph reconstruction | |
CN115661767A (en) | Image front vehicle target identification method based on convolutional neural network | |
CN113177432A (en) | Head pose estimation method, system, device and medium based on multi-scale lightweight network | |
CN114399638A (en) | Semantic segmentation network training method, equipment and medium based on patch learning | |
CN113920479A (en) | Target detection network construction method, target detection device and electronic equipment | |
CN112597996A (en) | Task-driven natural scene-based traffic sign significance detection method | |
CN116776208A (en) | Training method of seismic wave classification model, seismic wave selecting method, equipment and medium | |
CN115689946A (en) | Image restoration method, electronic device and computer program product | |
CN115981302A (en) | Vehicle following lane change behavior decision-making method and device and electronic equipment | |
CN112016599A (en) | Neural network training method and device for image retrieval and electronic equipment | |
CN117456480B (en) | Light vehicle re-identification method based on multi-source information fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||