CN108537825A - A target tracking method based on a transfer-learning regression network - Google Patents
A target tracking method based on a transfer-learning regression network
- Publication number: CN108537825A (application CN201810250785.6A)
- Authority: CN (China)
- Prior art keywords: target, image block, image, network, abscissa
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS; G06—COMPUTING, CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248 — Analysis of motion using feature-based methods involving reference images or patches
- G06T2207/10016 — Video; image sequence
- G06T2207/20021 — Dividing image into blocks, subimages or windows
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
Abstract
The present invention provides a target tracking method based on a transfer-learning regression network, in the technical field of computer vision. The target object to be tracked is selected and determined from an initial image; a block-prediction-based target-position regression network is constructed; a tracking-oriented training dataset is generated and the network is trained. For image input under real-time conditions, video images captured by a camera and stored in a memory block are extracted as the input images for tracking. For target positioning, the acquired image is fed into the position regression network; after a forward pass, the output layer yields 8 × 8 × 8 relative-position data. For the network update, the 8 × 8 × 8 relative positions between the 8 × 8 image blocks of the whole image and the target are computed from the obtained target position and, together with the current input image, form one group of training data.
Description
Technical field
The present invention relates to the fields of computer vision, computer graphics and imaging, machine intelligence, and systems technology.
Background technology
Visual target tracking is an important research subject in computer vision. Its main task is to acquire the target's continuous position, appearance, and motion, providing a basis for higher-level semantic analysis such as action recognition and scene understanding. Target tracking is widely applied in intelligent surveillance, human-computer interaction, and automatic control systems, and has strong practical value. Current target tracking methods fall mainly into classical methods and deep-learning methods.
Classical tracking methods are broadly divided into generative methods and discriminative methods. Generative methods assume the target can be expressed by some generating process or model, such as principal component analysis (PCA) or sparse coding; tracking is then treated as finding the most probable candidate within a region of interest. These methods aim to design an image representation that favors robust tracking. Discriminative methods, in contrast, treat tracking as a classification problem or a kind of continuous object detection whose task is to separate the target from the image background. Because such methods exploit target and background information simultaneously, they are currently the mainstream research direction. A discriminative method generally has two main steps: first, visual features that can discriminate target from background are selected to train a classifier and its decision rule; second, during tracking, the classifier evaluates each position in the field of view and determines the most probable target location. The target frame is then moved to that position and the process repeated, realizing tracking; this basic framework underlies tracking algorithms of many forms. Overall, the main advantages of classical trackers are running speed and low dependence on auxiliary data, but they must trade off tracking accuracy against real-time performance.
Deep learning has been a hot topic in machine learning in recent years. Owing to its powerful feature representation ability and continually developing datasets and hardware support, it has achieved striking success in many areas, such as speech recognition, image recognition, object detection, and video classification. Deep-learning-based tracking is also developing rapidly, but the shortage of prior knowledge and the real-time requirement in target tracking make it difficult for deep learning techniques, which need large amounts of training data and parameter computation, to be fully exploited here, leaving large room for exploration. Judging from current research results, deep-learning trackers mainly apply autoencoder networks and convolutional neural networks, following two main lines of research: one transfers a learned network and then fine-tunes it online; the other modifies the structure of a deep network to suit the requirements of tracking. The autoencoder network (AE) is a typical unsupervised deep learning network that, owing to its feature learning ability and noise resistance, was the first to be applied to tracking; it is relatively intuitive and moderate in size, an excellent unsupervised deep learning model that achieved good early results in tracking. Unlike the autoencoder, the convolutional neural network (CNN) is a supervised feed-forward network comprising repeated, alternating convolution, nonlinear transformation, and down-sampling operations, and it shows very powerful performance in pattern recognition, especially computer vision tasks. All in all, compared with classical methods, deep learning offers stronger feature representation, but in tracking the choice of training set, the selection and structural improvement of the network, the real-time performance of the algorithm, and the application of regression networks still require further study.
Summary of the invention
The object of the present invention is to provide a target tracking method based on a transfer-learning regression network, addressing the deep neural network's training-data problem and inaccurate target positioning during tracking.
The purpose of the invention is achieved through the following technical solution. The method uses a VGG-19 network and comprises the following steps:
(1) Target selection
The target object to be tracked is selected and determined from the initial image; the selection is either extracted automatically by a moving-object detection method or specified manually through human-computer interaction.
(2) Construction of the block-prediction-based target-position regression network
The target-position regression network based on block prediction consists of four parts: an image input layer, a transferred network for feature representation, a network layer of 4096 × 1 nodes, and a position output layer of 8 × 8 × 8 nodes. In the overall network, the input image, after size normalization to 224 × 224 pixels, serves as the input of the VGG-19 network; layer 23 of VGG-19 is fully connected to the 4096 × 1 network layer, i.e. layer 23 of VGG-19 provides the feature representation of the input image, and the 4096 × 1 layer is in turn fully connected to the 8 × 8 × 8 position output layer.
The 224 × 224 input image is divided into 8 × 8 = 64 image blocks of 28 × 28 pixels each. The position of each image block corresponds to the first two dimensions of the position output layer, while the 8 nodes of the third dimension express the relative position of the target predicted by the corresponding block; thus every input image yields 8 × 8 × 8 relative-position values after passing through the regression network.
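The block geometry above can be sketched as follows. This is an illustrative NumPy reconstruction (function and variable names are my own, not from the patent) showing only how a 224 × 224 input maps onto the 8 × 8 grid of 28 × 28 blocks, and how block (i, j) pairs with output nodes [i, j, :]:

```python
import numpy as np

def split_into_blocks(image: np.ndarray, grid: int = 8) -> np.ndarray:
    """Divide a 224x224 (grayscale, for simplicity) image into an 8x8 grid of blocks."""
    h, w = image.shape
    bh, bw = h // grid, w // grid                # 28 x 28 for a 224 x 224 input
    return image.reshape(grid, bh, grid, bw).swapaxes(1, 2)  # (8, 8, 28, 28)

def block_centers(size: int = 224, grid: int = 8) -> np.ndarray:
    """Centre coordinates of every block; block (i, j) pairs with output nodes [i, j, :]."""
    step = size // grid
    cs = np.arange(grid) * step + step // 2      # 14, 42, ..., 210
    xs, ys = np.meshgrid(cs, cs, indexing="ij")
    return np.stack([xs, ys], axis=-1)           # (8, 8, 2)

image = np.zeros((224, 224))
blocks = split_into_blocks(image)
print(blocks.shape)           # (8, 8, 28, 28)
print(block_centers()[0, 0])  # [14 14]
```

Each of the 64 blocks contributes 8 output nodes, giving the 8 × 8 × 8 position output layer.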
(3) Generation of the tracking-oriented training dataset
To train the position regression network, training data are obtained in two ways. On the one hand, for the first input frame, the image block corresponding to the target to be tracked is extracted; by artificial synthesis, the target block is placed at an arbitrary position in the first frame to generate a new image, and the region where the target block originally lay is filled with the mean value of the target block. The placement position of the target block is recorded, and the 8 × 8 × 8 relative positions between the 8 × 8 image blocks of the whole image and the target are computed; these position coordinates serve as the desired network output and, together with the image, form one group of training data. On the other hand, the extracted target block is first transformed, including translation, rotation, distortion, and occlusion, and then placed into the image as before to synthesize further training images. All these training data then compose the training dataset used later for network training.
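The synthesis step above (paste the target patch at a random position and mean-fill the vacated region) can be sketched as follows. This is a minimal NumPy illustration under assumed conventions; the function and variable names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_sample(image: np.ndarray, target_box):
    """Paste the target patch at a random location, fill the vacated region
    with the patch mean, and return the new image plus the new target box."""
    x0, y0, x1, y1 = target_box                   # top-left / bottom-right corners
    patch = image[y0:y1, x0:x1].copy()
    h, w = patch.shape
    out = image.copy()
    out[y0:y1, x0:x1] = patch.mean()              # mean-fill the old region
    nx = int(rng.integers(0, image.shape[1] - w + 1))  # new top-left corner
    ny = int(rng.integers(0, image.shape[0] - h + 1))
    out[ny:ny + h, nx:nx + w] = patch             # paste at the new position
    return out, (nx, ny, nx + w, ny + h)

img = np.arange(224 * 224, dtype=float).reshape(224, 224)
new_img, new_box = synthesize_sample(img, (14, 14, 42, 42))
print(new_img.shape, new_box)
```

The recorded new box is what the 8 × 8 × 8 relative-position labels are computed from.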
(4) Network training
During training, the training images are input one by one. The parameters of the VGG-19 part remain unchanged; the connection parameters between layer 23 of VGG-19, the network layer of 4096 × 1 nodes, and the position output layer of 8 × 8 × 8 nodes are trained using classical stochastic gradient descent (SGD).
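Since only the head of the network is trained, the per-sample SGD update reduces to gradient steps on the final fully connected mapping. The following toy NumPy sketch uses a stand-in 4096 → 512 linear layer (512 = 8 × 8 × 8) with a squared-error loss; the names, dimensions of the toy data, and the loss choice are my assumptions, as the patent does not specify a loss function:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the trainable head: features (4096) -> positions (8*8*8 = 512).
W = rng.normal(0, 0.01, (512, 4096))
b = np.zeros(512)

def sgd_step(feat, target, lr=1e-4):
    """One stochastic-gradient step on 0.5*||pred - target||^2, one sample at a time."""
    global W, b
    pred = W @ feat + b
    err = pred - target              # dLoss/dpred
    W -= lr * np.outer(err, feat)    # dLoss/dW = err feat^T
    b -= lr * err                    # dLoss/db = err
    return 0.5 * float(err @ err)

feat = rng.normal(size=4096)
target = rng.normal(size=512)
losses = [sgd_step(feat, target) for _ in range(50)]
print(losses[0], losses[-1])         # loss shrinks over repeated steps
```

In the patent's setting, `feat` would be the layer-23 VGG-19 features and `target` the 8 × 8 × 8 relative-position labels.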
(5) Image input
Under real-time conditions, video images captured by the camera and stored in the memory block are extracted as the input images for tracking. In offline processing, the captured video file is decomposed into an image sequence of frames, which are extracted one by one in temporal order as input images. If the input image is empty, the whole procedure stops.
(6) Target positioning
The image acquired in (5) is input to the position regression network; after the forward pass, the output layer yields 8 × 8 × 8 relative-position data, and the target is located by processing these 8 × 8 × 8 node values. Let A(i,j) denote the (i, j)-th image block of the current input image, and let xc(i,j) and yc(i,j) denote the abscissa and ordinate of its centre point. The 8 node values of the output layer corresponding to block A(i,j) are, respectively: the difference between the top-left corner abscissa of the target frame and the block-centre abscissa; the difference between the top-left ordinate and the block-centre ordinate; the difference between the top-right abscissa and the block-centre abscissa; the difference between the top-right ordinate and the block-centre ordinate; the difference between the bottom-left abscissa and the block-centre abscissa; the difference between the bottom-left ordinate and the block-centre ordinate; the difference between the bottom-right abscissa and the block-centre abscissa; and the difference between the bottom-right ordinate and the block-centre ordinate.
Let the target position be expressed by the abscissas and ordinates of the top-left, top-right, bottom-left, and bottom-right corners of the target frame. Each block A(i,j) then predicts a target position by adding its centre coordinates to its 8 output values. Since every image block predicts its own target position, and the corner coordinates predicted by different blocks generally differ, the four corner coordinates predicted by all blocks must be analysed statistically; here a coordinate-accumulation method determines the final coordinate of each corner and thereby locates the whole target. Specifically, four accumulation matrices are maintained, for the top-left, top-right, bottom-left, and bottom-right corners respectively, each entry holding that matrix's value at coordinate (a, b), with 0 ≤ a, b ≤ 224 and every entry initialized to 0. For each image block A(i,j), an accumulation operation is performed on the four matrices at the block's predicted corner coordinates. Finally, the coordinate of the element with the maximum value in each matrix is taken as the corresponding corner of the target frame, and target positioning is complete.
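The coordinate-accumulation decoding described above can be sketched as follows. This is an illustrative NumPy reconstruction (the names, and rounding the votes to integer pixel coordinates, are my assumptions): every block votes for its predicted corner coordinates, and the argmax of each accumulator matrix gives the final corner:

```python
import numpy as np

def decode_position(preds: np.ndarray, size: int = 224, grid: int = 8):
    """Vote-based decoding: each block adds 1 at each predicted corner
    coordinate; the argmax of each accumulator gives that corner's position."""
    acc = np.zeros((4, size + 1, size + 1))       # TL, TR, BL, BR accumulators
    step = size // grid
    cs = np.arange(grid) * step + step // 2       # block-centre coordinates
    for i in range(grid):
        for j in range(grid):
            cx, cy = cs[i], cs[j]
            p = preds[i, j]                       # 8 offsets: (dx, dy) per corner
            for k in range(4):
                x = int(round(cx + p[2 * k]))
                y = int(round(cy + p[2 * k + 1]))
                if 0 <= x <= size and 0 <= y <= size:
                    acc[k, x, y] += 1
    corners = []
    for k in range(4):
        idx = int(np.argmax(acc[k]))
        corners.append((idx // (size + 1), idx % (size + 1)))
    return corners  # [(x_tl, y_tl), (x_tr, y_tr), (x_bl, y_bl), (x_br, y_br)]

# Consistent predictions: every block points at the box with corners (50, 60)-(120, 160).
box = np.array([50, 60, 120, 60, 50, 160, 120, 160], dtype=float)  # TL, TR, BL, BR
cs = np.arange(8) * 28 + 14
preds = np.empty((8, 8, 8))
for i in range(8):
    for j in range(8):
        preds[i, j] = box - np.array([cs[i], cs[j]] * 4)
print(decode_position(preds))  # [(50, 60), (120, 60), (50, 160), (120, 160)]
```

With noisy per-block predictions, the argmax acts as a robust vote over the 64 block hypotheses.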
(7) Network update
From the target position obtained in (6), the 8 × 8 × 8 relative positions between the 8 × 8 image blocks of the whole image and the target are computed and, together with the current input image, form one group of training data. One round of network training is then carried out, realizing a fine-tuning update of the network, and the procedure branches back to (5).
The relative-position values comprise: the difference between the target frame's top-left corner abscissa and the image-block centre abscissa; the difference between the top-left ordinate and the centre ordinate; the difference between the top-right abscissa and the centre abscissa; the difference between the top-right ordinate and the centre ordinate; the difference between the bottom-left abscissa and the centre abscissa; the difference between the bottom-left ordinate and the centre ordinate; the difference between the bottom-right abscissa and the centre abscissa; and the difference between the bottom-right ordinate and the centre ordinate.
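The relative-position labels defined above (four corners, each as an x-offset then a y-offset from the block centre) can be computed as follows; a small NumPy sketch with hypothetical names:

```python
import numpy as np

def relative_positions(box, size: int = 224, grid: int = 8) -> np.ndarray:
    """(8, 8, 8) labels: offsets of the four target-frame corners
    (TL, TR, BL, BR, each as x then y) from every block centre."""
    x0, y0, x1, y1 = box                                   # top-left / bottom-right
    corners = np.array([x0, y0, x1, y0, x0, y1, x1, y1], dtype=float)
    step = size // grid
    cs = np.arange(grid) * step + step // 2                # block-centre coordinates
    labels = np.empty((grid, grid, 8))
    for i in range(grid):
        for j in range(grid):
            labels[i, j] = corners - np.array([cs[i], cs[j]] * 4)
    return labels

lab = relative_positions((50, 60, 120, 160))
print(lab.shape)        # (8, 8, 8)
print(lab[0, 0, :2])    # [36. 46.]  i.e. (50 - 14, 60 - 14)
```

These are exactly the desired outputs used as training targets in steps (3) and (7).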
Advantages and positive effects: the method first builds a position regression network composed of an image input layer, a transferred network for feature representation, a network layer of 4096 × 1 nodes, and a position output layer of 8 × 8 × 8 nodes. The image input layer applies unified preprocessing, normalizing each image to 224 × 224 pixels; the transferred network is the pre-trained VGG-19, whose layer 23 serves as the feature representation layer and is fully connected to the 4096 × 1 network layer, which in turn is fully connected to the 8 × 8 × 8 position output layer. To train the regression network effectively, various transformations are applied to the target and the image to synthesize a corresponding training dataset, and the network is trained with classical stochastic gradient descent. After a forward pass through the regression network, each input image yields a prediction of the target position from every image block of the whole image, comprising 8 × 8 × 8 relative positions, i.e. the relative coordinates of the four corners of the target frame. The target can then be located by the coordinate-accumulation method, realizing tracking. In addition, after each positioning step the network is fine-tuned and updated according to the currently determined target position, giving it a certain synchronous adaptation ability. By exploiting the powerful feature representation of deep learning, the invention can handle complex tracking scenes and achieve accurate target tracking, while the regression-based approach avoids a large number of location searches, greatly increasing positioning speed and enabling real-time tracking. Moreover, the method is not limited to single-target tracking: with corresponding improvements to the network (e.g. at the output end), it can be extended to multi-target tracking.
Description of the drawings
Fig. 1 is the structure diagram of the invention.
Fig. 2 is the flow chart of the invention.
Specific implementation
The method can be used in various target tracking applications, such as intelligent video analysis, automatic human-computer interaction, traffic video surveillance, vehicle driving, biological-population analysis, wild-animal motion analysis, moving-object segmentation at crossings, and fluid-surface velocimetry.
Taking intelligent video analysis as an example: it includes many important automatic analysis tasks, such as behaviour analysis, abnormality alarms, and video compression, and the basis of this work is stable target tracking, which the proposed method can realize. Specifically, the transfer-learning-based position regression network is first established, as shown in Fig. 1; various transformations are then applied to the target and the image to synthesize the corresponding training dataset, and the network is trained with classical stochastic gradient descent, after which it has the ability to locate the target. During tracking, the regression network performs a forward pass on the input image and outputs the corresponding relative target positions; from this information the coordinate-accumulation method statistically analyses and determines the target position, completing positioning and realizing tracking. In addition, after each positioning step the network is fine-tuned and updated according to the currently determined target position, giving it a certain synchronous adaptation ability. By exploiting the powerful feature representation of deep learning, the invention can handle complex tracking scenes and achieve accurate target tracking, while the regression-based approach avoids a large number of location searches, greatly increasing positioning speed and enabling real-time tracking. Moreover, the method can be extended from single-target to multi-target tracking through corresponding improvements to the network (e.g. at the output end).
The method of the present invention can be programmed in any computer programming language (such as C), and tracking-system software based on this method can realize real-time target tracking on any PC or embedded system.
Claims (2)
1. a kind of method for tracking target based on transfer learning Recurrent networks, this method is including the use of VGG-19 networks, feature
It is:
(1) Object selection
Select and determine the target object to be tracked from initial pictures, Object selection process by moving target detecting method from
Dynamic extraction, or be manually specified by man-machine interaction method;
(2) the target location Recurrent networks structure based on block prediction
Target location Recurrent networks based on block prediction are used for the migration network of feature representation, one by image input layer, one
A includes that four parts of network layer and a position output layer comprising 8 × 8 × 8 nodes of 4096 × 1 nodes are constituted;
In whole network, input picture is after the dimension normalization of 224 × 224 pixel sizes as the input number of VGG-19 networks
According to the 23rd layer of the VGG-19 networks is connect entirely with the network layer of 4096 × 1 nodes, that is, uses the 23rd layer of VGG-19
To input picture carry out feature representation, and the network layer of 4096 × 1 nodes again with the position output layer of 8 × 8 × 8 nodes into
The full connection of row;
The input picture of 224 × 224 sizes is divided into 8 × 8=64 image block, each tile size is 28 × 28 pictures
The position of element, each image block is corresponding with the node location of preceding bidimensional of position output layer, and the 8 of the third dimension of position output layer
A node then indicates the relative position for the target that corresponding image block is predicted, thus every input picture is by returning net
8 × 8 × 8 relative position values will be obtained after network;
(3) training dataset towards tracking generates
It sets Recurrent networks in order to align and is trained, obtain training data by two aspects here:On the one hand for
First frame input picture extracts the image block corresponding to it, according to the target to be tracked then by artificial synthesized
Mode, target image block is positioned over any position of first frame image and generates new image, where former target image block
Region then filled up with the mean value of target image block, while record target image block placement position and calculate by whole picture figure
As 8 × 88 × 8 × 8 relative positions between image block and target marked off, these position coordinate datas are as network
Desired output, they collectively form one group of training data with image;On the other hand, then it is the target image block that will first extract
It is converted, including translation, rotation, the operations such as distorts and block, be then positioned in image according still further to method as before
And compound training image;All these training datas then composing training data set is used for network training later;
(4) network training
In network training process, the image for training is carried out by the way of inputting one by one, the parameter of VGG-19 network portions
It remains unchanged, the 23rd layer of VGG-19 networks, include the position output layers of 8 × 8 × 8 nodes, and includes 4096 × 1 nodes
Network layer between Connecting quantity, be trained using classical stochastic gradient descent method (SGD);
(5) image inputs
Under real-time disposition, extraction acquires by camera and is stored in the video image of memory block, as to carry out with
The input picture of track;In processed offline, the video file acquired is decomposed into the image sequence of multiple frame compositions, is pressed
According to time sequencing, frame image is extracted one by one as input picture;If input picture is sky, whole flow process stops;
(6) target positions
(5) acquisition image is input in the Recurrent networks of position, after the processing of network forward direction, network output layer will obtain 8 × 8 ×
8 station-keeping datas determine target by the calculation processing of these 8 × 8 × 8 node datas to output layer
Position;If Ai,jIndicate (i, j) a image block of current input image,Indicate image block Ai,jThe abscissa of central point,
Indicate image block Ai,jThe ordinate of central point, image block Ai,j8 nodal values in corresponding Recurrent networks output layer are respectively They are respectively the upper left corner abscissa and figure of target frame
As the difference of block central point abscissa, the difference of the upper left corner ordinate and image block central point ordinate of target frame, target frame
Upper right corner abscissa and image block central point abscissa difference, the upper right corner ordinate of target frame is vertical with image block central point
The difference of coordinate, the difference of the lower left corner abscissa and image block central point abscissa of target frame, the lower left corner of target frame is vertical to be sat
The difference of mark and image block central point ordinate, the difference of the lower right corner abscissa and image block central point abscissa of target frame,
The difference of the lower right corner ordinate and image block central point ordinate of target frame;
If the target position is expressed as (x1, y1, x2, y2, x3, y3, x4, y4), where (x1, y1), (x2, y2), (x3, y3) and (x4, y4) denote the abscissa and ordinate of the upper-left, upper-right, lower-left and lower-right corners of the target frame respectively, then each image block Ai,j yields its own predicted target position: adding its 8 output values to the corresponding block-center coordinates gives the predicted coordinates of the four corners. Since the corner coordinates predicted by different image blocks are typically different, the four corner predictions of all image blocks must be statistically analyzed; here a coordinate-accumulation method is used to determine the final coordinate of each corner and thereby locate the entire target. Specifically, let M1, M2, M3 and M4 denote the coordinate-accumulation matrices for the upper-left, upper-right, lower-left and lower-right corners respectively, where Mk(a, b) is the value of matrix Mk at position (a, b), with 0 ≤ a, b ≤ 224; every element of these matrices is initialized to 0. For each image block Ai,j, the element of each matrix at that block's predicted coordinate of the corresponding corner is incremented; thus every image block performs one accumulation operation on the four matrices.
Finally, the coordinates of the element with the maximum value in each matrix are taken as the coordinates of the corresponding corner of the target frame; that is, the upper-left, upper-right, lower-left and lower-right corner coordinates of the target frame are the abscissa and ordinate of the maximum-valued element of the respective accumulation matrix, and target localization is complete;
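The coordinate-accumulation voting described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the vote-by-one increment, and the rounding of predicted coordinates to integer matrix indices are all assumptions; only the 0–224 coordinate range and the argmax rule come from the text.

```python
import numpy as np

IMG = 224  # accumulation matrices cover coordinates 0..224, per the claim


def vote_corners(predicted_corners):
    """predicted_corners: iterable of (x1, y1, ..., x4, y4) tuples, one per
    image block, ordered upper-left, upper-right, lower-left, lower-right.
    Returns the four voted corner coordinates of the target frame."""
    mats = [np.zeros((IMG + 1, IMG + 1)) for _ in range(4)]  # M1..M4
    for corners in predicted_corners:
        for k in range(4):
            x, y = corners[2 * k], corners[2 * k + 1]
            # Only votes that fall inside the coordinate range are counted.
            if 0 <= x <= IMG and 0 <= y <= IMG:
                mats[k][int(round(x)), int(round(y))] += 1
    result = []
    for m in mats:
        # Coordinate of the maximum-valued element becomes the final corner.
        a, b = np.unravel_index(np.argmax(m), m.shape)
        result.append((int(a), int(b)))
    return result
```

With 64 blocks each casting one vote per corner, outlier predictions from blocks far off the target are simply outvoted by the consistent majority.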
(7) Network update
According to the target position obtained in (6), the relative positions between the target and each of the 8 × 8 image blocks into which the entire image is partitioned are computed; together with the current input image these constitute one group of training data, on which one training iteration is performed to fine-tune the network; the process then returns to (5).
2. The target tracking method based on a transfer-learning regression network according to claim 1, characterized in that: the relative position values comprise the difference between the upper-left-corner abscissa of the target frame and the image-block-center abscissa, the difference between the upper-left-corner ordinate of the target frame and the image-block-center ordinate, the difference between the upper-right-corner abscissa of the target frame and the image-block-center abscissa, the difference between the upper-right-corner ordinate of the target frame and the image-block-center ordinate, the difference between the lower-left-corner abscissa of the target frame and the image-block-center abscissa, the difference between the lower-left-corner ordinate of the target frame and the image-block-center ordinate, the difference between the lower-right-corner abscissa of the target frame and the image-block-center abscissa, and the difference between the lower-right-corner ordinate of the target frame and the image-block-center ordinate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810250785.6A CN108537825B (en) | 2018-03-26 | 2018-03-26 | Target tracking method based on transfer learning regression network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810250785.6A CN108537825B (en) | 2018-03-26 | 2018-03-26 | Target tracking method based on transfer learning regression network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108537825A true CN108537825A (en) | 2018-09-14 |
CN108537825B CN108537825B (en) | 2021-08-17 |
Family
ID=63484603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810250785.6A Expired - Fee Related CN108537825B (en) | 2018-03-26 | 2018-03-26 | Target tracking method based on transfer learning regression network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108537825B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109493370A (en) * | 2018-10-12 | 2019-03-19 | Southwest Jiaotong University | Target tracking method based on spatial offset learning
CN110162475A (en) * | 2019-05-27 | 2019-08-23 | Zhejiang University of Technology | Software defect prediction method based on deep transfer learning
CN111127510A (en) * | 2018-11-01 | 2020-05-08 | Hangzhou Hikvision Digital Technology Co., Ltd. | Target object position prediction method and device
CN113192062A (en) * | 2021-05-25 | 2021-07-30 | Hubei University of Technology | Self-supervised segmentation method for arterial plaque ultrasound images based on image inpainting
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103310466A (en) * | 2013-06-28 | 2013-09-18 | Anke Smart City Technology (China) Co., Ltd. | Single-target tracking method and implementation device thereof
US20160275339A1 (en) * | 2014-01-13 | 2016-09-22 | Carnegie Mellon University | System and Method for Detecting and Tracking Facial Features In Images
CN107146237A (en) * | 2017-04-24 | 2017-09-08 | Southwest Jiaotong University | Target tracking method based on online state learning and estimation
CN107452023A (en) * | 2017-07-21 | 2017-12-08 | Shanghai Jiao Tong University | Single-target tracking method and system based on online learning of convolutional neural networks
US20170372174A1 (en) * | 2016-06-28 | 2017-12-28 | Conduent Business Services, Llc | System and method for expanding and training convolutional neural networks for large size input images |
Non-Patent Citations (4)
Title |
---|
SHUNLI ZHANG et al.: "Object tracking with adaptive elastic net regression", 2017 IEEE International Conference on Image Processing (ICIP) *
LU HUCHUAN et al.: "A survey of object tracking algorithms", Pattern Recognition and Artificial Intelligence *
QUAN WEI: "Visual object tracking method with online learning and multiple detections", Acta Electronica Sinica *
LI YUBING: "Research on visual tracking algorithms based on deep networks", China Master's Theses Full-text Database, Information Science and Technology *
Also Published As
Publication number | Publication date |
---|---|
CN108537825B (en) | 2021-08-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110210551B (en) | Visual target tracking method based on adaptive subject sensitivity | |
CN110059558B (en) | Orchard obstacle real-time detection method based on improved SSD network | |
CN109711262B (en) | Intelligent excavator pedestrian detection method based on deep convolutional neural network | |
CN108171141B (en) | Attention model-based cascaded multi-mode fusion video target tracking method | |
CN108537825A (en) | Target tracking method based on transfer learning regression network | |
CN110660082A (en) | Target tracking method based on graph convolution and trajectory convolution network learning | |
CN107146237B (en) | Target tracking method based on online state learning and estimation | |
CN111626128A (en) | Improved YOLOv 3-based pedestrian detection method in orchard environment | |
CN109766873B (en) | Pedestrian re-identification method based on hybrid deformable convolution | |
CN110659664B (en) | SSD-based high-precision small object identification method | |
CN112818925B (en) | Urban building and crown identification method | |
CN113240691A (en) | Medical image segmentation method based on U-shaped network | |
Ren et al. | A novel squeeze YOLO-based real-time people counting approach | |
CN109993770A (en) | Target tracking method with adaptive spatio-temporal learning and state recognition | |
WO2023030182A1 (en) | Image generation method and apparatus | |
CN105243154A (en) | Remote sensing image retrieval method and system based on salient point features and sparse auto-encoding | |
CN110334584B (en) | Gesture recognition method based on regional full convolution network | |
CN115115859A (en) | Long linear engineering construction progress intelligent identification and analysis method based on unmanned aerial vehicle aerial photography | |
CN106599810A (en) | Head pose estimation method based on stacked auto-encoding | |
CN116402851A (en) | Infrared dim target tracking method under complex background | |
Chun-Lei et al. | Intelligent detection for tunnel shotcrete spray using deep learning and LiDAR | |
CN109493370A (en) | Target tracking method based on spatial offset learning | |
CN106530330A (en) | Low-rank sparse-based video target tracking method | |
CN109272036A (en) | Random fern target tracking method based on deep residual network | |
CN111368637B (en) | Transfer robot target identification method based on multi-mask convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20210817 |