CN108537825B - Target tracking method based on transfer learning regression network - Google Patents
- Publication number
- CN108537825B (publication); CN201810250785A / CN201810250785.6A (application)
- Authority
- CN
- China
- Prior art keywords
- image
- target
- network
- ordinate
- abscissa
- Prior art date
- Expired - Fee Related
Classifications (CPC; all under G—PHYSICS, G06—COMPUTING; CALCULATING OR COUNTING, G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL)
- G06T7/246 — Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248 — Analysis of motion using feature-based methods involving reference images or patches
- G06T2207/10016 — Image acquisition modality: video; image sequence
- G06T2207/20021 — Dividing image into blocks, subimages or windows
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
Abstract
The invention provides a target tracking method based on a transfer learning regression network, relating to the technical field of computer vision. A target object to be tracked is selected and determined from the initial image; a target position regression network based on block prediction is constructed; a tracking-oriented training data set is generated and the network is trained. For image input under real-time processing, the video image collected by the camera and stored in the storage area is extracted as the input image to be tracked. For target localization, the obtained image is input into the position regression network, and after the forward processing of the network, the network output layer yields 8 × 8 × 8 relative position data. For the network update, the 8 × 8 × 8 relative positions between the 8 × 8 image blocks divided from the whole image and the target are calculated according to the obtained target position, forming a set of training data together with the current input image.
Description
Technical Field
The present invention relates to the technical fields of computer vision, computer graphics and imaging, machine intelligence, and intelligent systems.
Background
Visual target tracking is an important research subject in the field of computer vision. Its main task is to acquire information such as the continuous position, appearance and motion of targets, and thereby provide a basis for further semantic-level analysis (such as behavior recognition and scene understanding). Target tracking research is widely applied in fields such as intelligent surveillance, human-computer interaction and automatic control systems, and has strong practical value. At present, target tracking methods mainly comprise classical target tracking methods and deep learning target tracking methods.
Classical target tracking methods are mainly classified into generative methods and discriminative methods. Generative methods assume that the target can be expressed through some generative process or model, such as Principal Component Analysis (PCA) or sparse coding, and then treat the tracking problem as finding the most likely candidate in the region of interest. These methods aim at designing an image representation that facilitates robust target tracking. Unlike generative methods, discriminative methods treat tracking as a classification or continuous object detection problem, whose task is to distinguish the target from the image background. This type of method, which utilizes both target and background information, is currently the mainstream of research. Discriminative methods typically involve two main steps: the first is training a classifier and its decision rules by selecting visual features that discriminate between target and background; the second is using the classifier during tracking to evaluate each location within the field of view and determine the most likely target location. The target frame is then moved to that location and the process is repeated to effect tracking; this framework is used to design tracking algorithms of various forms. In general, the main advantages of classical tracking methods are their running speed and their low dependence on auxiliary data, while they also require a trade-off between tracking accuracy and real-time performance.
Deep learning, a hot spot of machine learning research in recent years, has achieved surprising success in many areas such as speech recognition, image recognition, object detection and video classification, owing to its powerful feature expression capability and ever-improving data sets and hardware support. Deep learning research on target tracking has also developed rapidly, but owing to the lack of prior knowledge in target tracking and its real-time requirements, deep learning techniques based on large amounts of training data and parameter computation are difficult to exploit fully in this area, leaving a large space for exploration. Judging from current research results, deep learning tracking methods mainly apply auto-encoder networks and convolutional neural networks, along two main lines of work: one is to perform transfer learning on the network and then fine-tune it online, and the other is to modify the structure of the deep network to adapt it to the tracking requirements. The auto-encoder network (AE) is a typical unsupervised deep learning network, and its feature learning capability and noise robustness led to its being applied to target tracking first. Overall, the auto-encoder network is intuitive and moderate in size, is an excellent unsupervised deep learning model, and was the first applied to tracking, with good results. In contrast to auto-encoder networks, Convolutional Neural Networks (CNNs) are supervised feedforward neural networks, which involve cyclically alternating convolution, nonlinear transformation and downsampling operations, and exhibit very powerful performance in pattern recognition, especially in computer vision tasks.
In general, deep learning has stronger feature expression capability compared with the classical method, and further research is still needed in the aspects of selection of related training sets, improvement of network selection and structure, real-time performance of algorithms, application of recurrent neural networks and the like in the tracking method.
Disclosure of Invention
The invention aims to provide a target tracking method based on a transfer learning regression network, using a deep neural network to solve the problems of inaccurate training data and imprecise target positioning during tracking.
The purpose of the invention is realized by the following technical scheme. The method, which utilizes a VGG-19 network, comprises the following steps:
(1) target selection
Selecting and determining a target object to be tracked from the initial image, wherein the target selection process is automatically extracted by a moving target detection method or manually specified by a human-computer interaction method;
(2) target position regression network construction based on block prediction
The target position regression network based on block prediction is composed of four parts: an image input layer, a migration network for feature expression, a network layer containing 4096 × 1 nodes, and a position output layer containing 8 × 8 × 8 nodes. In the whole network, an input image is scale-normalized to 224 × 224 pixels and used as the input data of a VGG-19 network; the 23rd layer of the VGG-19 network is fully connected to the network layer of 4096 × 1 nodes, i.e. the 23rd layer of VGG-19 is adopted to perform feature expression on the input image; and the network layer of 4096 × 1 nodes is fully connected to the position output layer of 8 × 8 × 8 nodes;
dividing the input image of size 224 × 224 into 8 × 8 = 64 image blocks, each of size 28 × 28 pixels, the position of each image block corresponds to a node position in the first two dimensions of the position output layer, and the 8 nodes in the third dimension of the position output layer represent the relative position of the target predicted by the corresponding image block, so each input image passed through the regression network yields 8 × 8 × 8 relative position values;
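The block division and the 8 × 8 × 8 relative-position labels described above can be sketched as follows; this is a minimal illustration in which the function name, array layout and corner ordering are assumptions, not part of the patent:

```python
import numpy as np

def relative_positions(target_corners, grid=8, image_size=224):
    """Compute the 8x8x8 relative-position tensor for one image.

    target_corners: (4, 2) array of (x, y) for the target frame's
    upper-left, upper-right, lower-left and lower-right corners.
    Returns an (8, 8, 8) array: for each 28x28 block, the x/y offsets
    of the four corners from the block's center point.
    """
    block = image_size // grid  # 28 pixels per block
    out = np.zeros((grid, grid, 8))
    for i in range(grid):          # block row
        for j in range(grid):      # block column
            cx = j * block + block / 2.0  # block center, abscissa
            cy = i * block + block / 2.0  # block center, ordinate
            offsets = target_corners - np.array([cx, cy])
            out[i, j] = offsets.reshape(-1)  # 8 values per block
    return out
```

Each block's 8 values are the corner offsets measured from that block's own center, which is exactly what the output layer is trained to regress.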
(3) trace-oriented training data set generation
To be able to train the position regression network, training data are acquired here in two ways. On the one hand, for the first-frame input image, the corresponding image block is extracted according to the target to be tracked; the target image block is then placed at an arbitrary position of the first-frame image by manual synthesis to generate a new image, and the area where the original target image block was located is filled with the mean value of the target image block. At the same time, the position where the target image block is placed is recorded, the 8 × 8 × 8 relative positions between the 8 × 8 image blocks dividing the whole image and the target are calculated, and these position coordinate data are used as the expected output of the network, forming a group of training data together with the image. On the other hand, the extracted target image block is first transformed, including operations such as translation, rotation, distortion and occlusion, and then placed in the image by the same method to synthesize a training image. All the training data form a training data set, which is then used for network training;
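The first branch of the data synthesis above — cut the target patch out, fill the vacated area with its mean, and paste it at a new position — can be sketched as below (the function and argument names are illustrative assumptions):

```python
import numpy as np

def synthesize_training_image(frame, target_box, new_topleft):
    """Create one synthesized training image as in step (3).

    frame: (224, 224, 3) first-frame image; target_box: (x, y, w, h)
    of the target patch; new_topleft: (nx, ny) top-left corner where
    the patch is re-placed. Returns the new image and new target box.
    """
    x, y, w, h = target_box
    patch = frame[y:y + h, x:x + w].copy()
    img = frame.copy()
    # Fill the vacated region with the mean value of the target patch.
    img[y:y + h, x:x + w] = patch.mean(axis=(0, 1))
    nx, ny = new_topleft
    # Paste the target patch at the chosen new position.
    img[ny:ny + h, nx:nx + w] = patch
    return img, (nx, ny, w, h)
```

The recorded new box is what the 8 × 8 × 8 relative-position labels are computed from for this synthesized image.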
(4) network training
In the network training process, the images used for training are input one by one, the parameters of the VGG-19 network part are kept unchanged, and the connection parameters between the 23rd layer of the VGG-19 network, the network layer containing 4096 × 1 nodes, and the position output layer containing 8 × 8 × 8 nodes are trained by classical stochastic gradient descent (SGD);
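Step (4) trains only the fully connected layers on top of the frozen backbone. A minimal numpy sketch of one such SGD step on a squared-error loss is given below; the explicit gradients, layer sizes, and the absence of biases and nonlinearities are simplifying assumptions — the patent itself only specifies classical SGD:

```python
import numpy as np

def sgd_head_step(W2, W3, feat, target, lr=0.01):
    """One SGD step on the trainable head of the regression network.

    feat: frozen backbone features (the VGG-19 part is not updated);
    W2: weights into the 4096 x 1 hidden layer; W3: weights into the
    8 x 8 x 8 (= 512) output layer, flattened. Loss: 0.5 * ||y - t||^2.
    """
    h = W2 @ feat                     # hidden-layer activations
    y = W3 @ h                        # predicted relative positions
    err = y - target                  # gradient of the loss w.r.t. y
    gW3 = np.outer(err, h)            # dL/dW3
    gW2 = np.outer(W3.T @ err, feat)  # dL/dW2, backpropagated through W3
    return W2 - lr * gW2, W3 - lr * gW3
```

Because `feat` is treated as a constant, no gradient ever flows into the backbone, which is the "parameters of the VGG-19 part are kept unchanged" condition.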
(5) image input
Under real-time processing, a video image acquired by the camera and stored in the storage area is extracted as the input image to be tracked; under off-line processing, the acquired video file is decomposed into an image sequence consisting of a number of frames, and the frame images are extracted one by one in time order as input images. If the input image is empty, the whole process stops;
(6) target localization
Input the obtained image into the position regression network. After the forward processing of the network, the network output layer yields 8 × 8 × 8 relative position data, and the target is located by computing over these 8 × 8 × 8 node values of the output layer. Let A_{i,j} denote the (i, j)-th image block of the current input image, x_{i,j} the abscissa of its center point, and y_{i,j} the ordinate of its center point. The 8 node values in the regression network output layer corresponding to image block A_{i,j} are denoted d_x^{tl}, d_y^{tl}, d_x^{tr}, d_y^{tr}, d_x^{bl}, d_y^{bl}, d_x^{br}, d_y^{br}; these are respectively the difference between the abscissa of the upper-left corner of the target frame and the abscissa of the image-block center point, the difference between the ordinate of the upper-left corner and the ordinate of the center point, and likewise the abscissa and ordinate differences for the upper-right, lower-left and lower-right corners of the target frame.
Let the target position be expressed as P = (x^{tl}, y^{tl}, x^{tr}, y^{tr}, x^{bl}, y^{bl}, x^{br}, y^{br}), whose components respectively represent the abscissa and ordinate of the upper-left corner of the target frame, the abscissa and ordinate of the upper-right corner, the abscissa and ordinate of the lower-left corner, and the abscissa and ordinate of the lower-right corner. The target position predicted by image block A_{i,j} is then given by x^{tl}_{i,j} = x_{i,j} + d_x^{tl} and y^{tl}_{i,j} = y_{i,j} + d_y^{tl} for the upper-left corner, and analogously for the other three corners. Each image block thus has its own predicted target position, and the predicted corner coordinates usually differ between blocks, so the corner coordinates predicted by all the image blocks need to be statistically analyzed, and a coordinate accumulation method is adopted to determine the final coordinates of each corner of the target frame, thereby locating the whole target. Concretely, let M^{tl}, M^{tr}, M^{bl} and M^{br} denote the coordinate accumulation matrices of the upper-left, upper-right, lower-left and lower-right corners of the target frame, where M^{tl}(a, b), M^{tr}(a, b), M^{bl}(a, b) and M^{br}(a, b) are the values of the corresponding matrices at (a, b), with 0 ≤ a, b ≤ 224, and every element of these matrices initially 0. For image block A_{i,j}, set M^{tl}(x^{tl}_{i,j}, y^{tl}_{i,j}) ← M^{tl}(x^{tl}_{i,j}, y^{tl}_{i,j}) + 1, and likewise for the other three corners; in this way the four matrices are accumulated over each image block.
Finally, the coordinates of the maximum-value element of each matrix are taken as the coordinates of the corresponding corner of the target frame, i.e. (x^{tl}, y^{tl}) = argmax_{(a,b)} M^{tl}(a, b), and analogously for the upper-right, lower-left and lower-right corners, which completes the target localization;
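The coordinate accumulation of step (6) is essentially a per-corner voting scheme: every block casts one vote per corner, and the argmax of each accumulation matrix is the final corner. A sketch, in which the array sizes and the rounding of votes to integer pixels are assumptions:

```python
import numpy as np

def accumulate_corners(pred, grid=8, image_size=224):
    """Locate the target from the 8x8x8 output by coordinate accumulation.

    pred: (8, 8, 8) relative positions per block, ordered as
    (x, y) offsets for the upper-left, upper-right, lower-left and
    lower-right corners. Returns the four corner coordinates.
    """
    block = image_size // grid
    # One accumulation matrix per corner: M_tl, M_tr, M_bl, M_br.
    acc = np.zeros((4, image_size + 1, image_size + 1))
    for i in range(grid):
        for j in range(grid):
            cx = j * block + block / 2.0  # block center, abscissa
            cy = i * block + block / 2.0  # block center, ordinate
            for k in range(4):
                x = int(round(cx + pred[i, j, 2 * k]))
                y = int(round(cy + pred[i, j, 2 * k + 1]))
                if 0 <= x <= image_size and 0 <= y <= image_size:
                    acc[k, x, y] += 1  # one vote from block (i, j)
    # The maximum-value element of each matrix gives the final corner.
    return [np.unravel_index(acc[k].argmax(), acc[k].shape)
            for k in range(4)]
```

When the blocks agree, all 64 votes per corner land on one cell, so the argmax recovers the target frame exactly; disagreeing blocks merely spread votes without shifting the peak.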
(7) network update
According to the target position obtained in step (6), the 8 × 8 × 8 relative positions between the 8 × 8 image blocks divided from the whole image and the target are calculated, forming a group of training data together with the current input image; one round of network training is performed to achieve fine tuning and updating of the network, and the process then jumps back to step (5).
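Steps (5)-(7) form the per-frame loop of the method. A minimal sketch with stubbed components; `net`, `locate` and `make_labels` are assumed callables standing in for the regression network, the coordinate-accumulation localization, and the label computation of step (7):

```python
def track(frames, net, locate, make_labels):
    """Run the tracking loop of steps (5)-(7) over an image source."""
    for img in frames:               # step (5): next input image
        if img is None:              # an empty input stops the process
            break
        pred = net.forward(img)      # step (6): forward pass, 8x8x8 output
        corners = locate(pred)       # coordinate-accumulation localization
        x, y = make_labels(img, corners)
        net.train_step(x, y)         # step (7): one fine-tuning update
        yield corners                # the tracked target position
```

The loop fine-tunes after every localization, which is what gives the network its adjustment capability as the target's appearance changes.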
The relative position values comprise, for each of the four corners of the target frame (upper-left, upper-right, lower-left and lower-right), the difference between the abscissa of that corner and the abscissa of the image-block center point, and the difference between the ordinate of that corner and the ordinate of the image-block center point.
The advantages and positive effects are as follows. A position regression network is first constructed, consisting of four parts: an image input layer, a migration network for feature expression, a network layer containing 4096 × 1 nodes, and a position output layer containing 8 × 8 × 8 nodes. The image input layer uniformly preprocesses the input images, normalizing them to 224 × 224 pixels; the migration network is the pre-trained network VGG-19, whose 23rd layer is used as the feature expression layer, followed by a fully connected network layer containing 4096 × 1 nodes, which is in turn fully connected with the position output layer. To train the position regression network effectively, a number of transformations are applied to the target and the image, the corresponding training data set is synthesized, and network training is carried out by classical stochastic gradient descent. The input image is processed forward by the regression network to obtain, for each image block into which the whole image is divided, a prediction of the target position, comprising 8 × 8 × 8 relative positions corresponding to the relative coordinate values of the four corners of the target frame. The target can therefore be positioned by the coordinate accumulation method, and tracking is realized. In addition, after each target localization, the network is fine-tuned and updated according to the currently determined target position, so that the network has a certain synchronous adjustment capability.
By exploiting the strong feature expression capability of deep learning, the invention can handle complex tracking scenes and realize accurate target tracking; at the same time, the regression-based method avoids a large amount of position searching, greatly improves the target positioning speed, and can realize real-time target tracking. In addition, the method can be used for single-target tracking, and can also be extended to multi-target tracking by correspondingly modifying the network (for example, its output end).
Drawings
FIG. 1 is a block diagram of the present invention.
FIG. 2 is a flow chart of the present invention.
Detailed Description
The method can be used in various occasions of target tracking, such as intelligent video analysis, automatic man-machine interaction, traffic video monitoring, unmanned vehicle driving, biological population analysis, field animal motion analysis, crossing moving object detection, fluid surface speed measurement and the like.
Take intelligent video analysis as an example. Intelligent video analysis comprises a number of important automatic analysis tasks, such as behavior analysis, anomaly alarms and video compression, and the basis of these tasks is stable target tracking, which can be realized by the tracking method provided by the invention. Specifically, a position regression network based on transfer learning is first established, as shown in Figure 1; then a number of transformations are applied to the target and the image, the corresponding training data set is synthesized, and network training is carried out by classical stochastic gradient descent, after which the network has acquired the capability of locating the target. In the tracking process, the regression network processes the input image forward and outputs the relative position information of the target corresponding to the image; from this information the target position can be statistically analyzed and located by the coordinate accumulation method, thereby realizing tracking. In addition, after each target localization, the network is fine-tuned and updated according to the currently determined target position, so that the network has a certain synchronous adjustment capability. By exploiting the strong feature expression capability of deep learning, the invention can handle complex tracking scenes and realize accurate target tracking; at the same time, the regression-based method avoids a large amount of position searching, greatly improves the target positioning speed, and can realize real-time target tracking. The method can also be extended from single-target to multi-target tracking by correspondingly modifying the network (for example, its output end).
The method first establishes a position regression network based on transfer learning, then applies various transformations to the target and the image to synthesize a corresponding training data set, and carries out network training by classical stochastic gradient descent; after training, the network has acquired the capability of locating the target. In the tracking process, the regression network processes the input image forward and outputs the relative position information of the target corresponding to the image; from this information the target position can be statistically analyzed and located by the coordinate accumulation method, thereby realizing tracking. In addition, after each target localization, the network is fine-tuned and updated according to the currently determined target position, so that the network has a certain synchronous adjustment capability.
The method can be realized by programming in any computer programming language (such as C language), and the tracking system software based on the method can realize real-time target tracking application in any PC or embedded system.
Claims (2)
1. A target tracking method based on a transfer learning regression network, utilizing a VGG-19 network and characterized by the following steps:
(1) target selection
Selecting and determining a target object to be tracked from the initial image, wherein the target selection process is automatically extracted by a moving target detection method or manually specified by a human-computer interaction method;
(2) target position regression network construction based on block prediction
the target position regression network based on block prediction is composed of four parts: an image input layer, a migration network for feature expression, a network layer containing 4096 × 1 nodes, and a position output layer containing 8 × 8 × 8 nodes; in the whole network, an input image is scale-normalized to 224 × 224 pixels and used as the input data of a VGG-19 network; the 23rd layer of the VGG-19 network is fully connected to the network layer of 4096 × 1 nodes, i.e. the 23rd layer of VGG-19 is adopted to perform feature expression on the input image; and the network layer of 4096 × 1 nodes is fully connected to the position output layer of 8 × 8 × 8 nodes;
dividing the input image of size 224 × 224 into 8 × 8 = 64 image blocks, each of size 28 × 28 pixels, the position of each image block corresponds to a node position in the first two dimensions of the position output layer, and the 8 nodes in the third dimension of the position output layer represent the relative position of the target predicted by the corresponding image block, so each input image passed through the regression network yields 8 × 8 × 8 relative position values;
(3) trace-oriented training data set generation
to be able to train the position regression network, training data are acquired here in two ways: on the one hand, for the first-frame input image, the corresponding image block is extracted according to the target to be tracked; the target image block is then placed at an arbitrary position of the first-frame image by manual synthesis to generate a new image, and the area where the original target image block was located is filled with the mean value of the target image block; at the same time, the position where the target image block is placed is recorded, the 8 × 8 × 8 relative positions between the 8 × 8 image blocks dividing the whole image and the target are calculated, and these position coordinate data are used as the expected output of the network, forming a group of training data together with the image; on the other hand, the extracted target image block is first transformed, including operations such as translation, rotation, distortion and occlusion, and then placed in the image by the same method to synthesize a training image; all the training data form a training data set, which is then used for network training;
(4) network training
in the network training process, the images used for training are input one by one, the parameters of the VGG-19 network part are kept unchanged, and the connection parameters between the 23rd layer of the VGG-19 network, the network layer containing 4096 × 1 nodes, and the position output layer containing 8 × 8 × 8 nodes are trained by classical stochastic gradient descent (SGD);
(5) image input
Under the condition of real-time processing, extracting a video image which is acquired by a camera and stored in a storage area as an input image to be tracked; under the condition of off-line processing, decomposing an acquired video file into an image sequence consisting of a plurality of frames, and extracting frame images one by one as input images according to a time sequence; if the input image is empty, the whole process is stopped;
(6) target localization
inputting the image obtained in step (5) into the position regression network; after the forward processing of the network, the network output layer yields 8 × 8 × 8 relative position data, and the target is located by computing over these 8 × 8 × 8 node values of the output layer; let A_{i,j} denote the (i, j)-th image block of the current input image, x_{i,j} the abscissa of its center point, and y_{i,j} the ordinate of its center point; the 8 node values in the regression network output layer corresponding to image block A_{i,j} are denoted d_x^{tl}, d_y^{tl}, d_x^{tr}, d_y^{tr}, d_x^{bl}, d_y^{bl}, d_x^{br}, d_y^{br}, which are respectively the difference between the abscissa of the upper-left corner of the target frame and the abscissa of the image-block center point, the difference between the ordinate of the upper-left corner and the ordinate of the center point, and likewise the abscissa and ordinate differences for the upper-right, lower-left and lower-right corners of the target frame;
the target position is expressed asWherein Respectively representing the abscissa and ordinate of the upper left corner of the target frame, the abscissa and ordinate of the upper right corner, the abscissa and ordinate of the lower left corner and the abscissa and ordinate of the lower right corner; the image block Ai,jThe predicted target position is I.e. for Ai,jIs provided with Each image block has a respective predicted target position, and the coordinates of the four corners of the predicted target frame are usually different, so the coordinates of the four corners of the predicted target frame of each image block need to be statistically analyzed, and a coordinate accumulation method is adopted to determine the final coordinates of each corner of the target frame, so that the whole target is positioned; the concrete method comprises the following steps ofRespectively representing the coordinate accumulation matrixes of the upper left corner, the upper right corner, the lower left corner and the lower right corner of the target frame, wherein The values of the corresponding matrixes at (a and b) are respectively, 0 is less than or equal to a, b is less than or equal to 224, and each element value of the matrixes is 0 initially; for image block Ai,jIs provided with Thus for each image block, the four matrices are subjected to an accumulation operation;
finally, the coordinates of the element with the maximum value in each coordinate accumulation matrix M1, M2, M3 and M4 (for the upper left, upper right, lower left and lower right corners respectively) are correspondingly used as the coordinates of the four corners of the target frame, i.e. (x^1, y^1) is the (a, b) maximizing M1(a, b), (x^2, y^2) the (a, b) maximizing M2(a, b), (x^3, y^3) the (a, b) maximizing M3(a, b), and (x^4, y^4) the (a, b) maximizing M4(a, b), where each pair represents the abscissa and ordinate of the maximal element of the corresponding coordinate accumulation matrix, thereby finishing target localization;
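The corner-voting localization described above can be sketched as follows. This is a minimal NumPy illustration, not the patented implementation: the 224 × 224 image size follows the 0 ≤ a, b ≤ 224 matrix range in the claim, but the 28 × 28 block size, the block-center convention, and all function names are assumptions made for the example.

```python
import numpy as np

GRID = 8           # 8 x 8 image blocks, as in the claim
IMG = 224          # image side, matching the 0 <= a, b <= 224 matrix range
BLOCK = IMG // GRID

def locate_target(offsets):
    """Locate the target frame from the network's 8 x 8 x 8 output.

    offsets[i, j] holds the 8 node values for block A_{i,j}:
    (d1, d2, ..., d8) = corner offsets from the block's center, ordered
    upper-left x/y, upper-right x/y, lower-left x/y, lower-right x/y.
    Returns the four corner coordinates (x1, y1, ..., x4, y4).
    """
    # One coordinate accumulation matrix per corner, all zeros initially.
    acc = np.zeros((4, IMG + 1, IMG + 1), dtype=np.int32)
    for i in range(GRID):
        for j in range(GRID):
            # Assumed center point of block A_{i,j}.
            cx = j * BLOCK + BLOCK // 2
            cy = i * BLOCK + BLOCK // 2
            for c in range(4):                        # tl, tr, bl, br
                x = int(round(cx + offsets[i, j, 2 * c]))
                y = int(round(cy + offsets[i, j, 2 * c + 1]))
                if 0 <= x <= IMG and 0 <= y <= IMG:
                    acc[c, x, y] += 1                 # cast one vote
    # The element with the maximum vote count gives each corner.
    corners = []
    for c in range(4):
        x, y = np.unravel_index(np.argmax(acc[c]), acc[c].shape)
        corners += [int(x), int(y)]
    return tuple(corners)
```

Accumulating votes and taking the argmax makes the final box robust to a few image blocks predicting wildly wrong corners, which is the point of the statistical analysis step.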
(7) network update
According to the target position obtained in step (6), the 8 × 8 × 8 relative positions between the 8 × 8 image blocks divided from the whole image and the target are calculated; together with the current input image they form a group of training data, network training is performed once to realize fine tuning and updating of the network, and the process then jumps back to step (5).
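The training targets for this update step can be sketched as below: for every image block, the difference between each corner coordinate of the located target frame and the block's center coordinate. As with the localization sketch, the block-center convention and the function name are assumptions for illustration only; the 224 × 224 size follows the claim's matrix range.

```python
import numpy as np

GRID, IMG = 8, 224            # 8 x 8 blocks over an assumed 224 x 224 image
BLOCK = IMG // GRID

def relative_position_targets(box):
    """Build the 8 x 8 x 8 relative-position training data for step (7).

    box = (x1, y1, x2, y2, x3, y3, x4, y4): upper-left, upper-right,
    lower-left and lower-right corners of the target frame from step (6).
    targets[i, j] holds, for block A_{i,j}, the difference between each
    corner coordinate and the block's center coordinate, in the same
    order as the 8 output-layer node values.
    """
    targets = np.empty((GRID, GRID, 8), dtype=np.float32)
    for i in range(GRID):
        for j in range(GRID):
            cx = j * BLOCK + BLOCK // 2   # assumed block center, x
            cy = i * BLOCK + BLOCK // 2   # assumed block center, y
            for c in range(4):
                targets[i, j, 2 * c] = box[2 * c] - cx
                targets[i, j, 2 * c + 1] = box[2 * c + 1] - cy
    return targets
```

Pairing these targets with the current frame gives one group of training data, so a single gradient step fine-tunes the regression network online before the tracker moves to the next frame.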
2. The target tracking method based on the transfer learning regression network according to claim 1, characterized in that: the relative position values comprise a difference value between the abscissa of the upper left corner of the target frame and the abscissa of the center point of the image block, a difference value between the ordinate of the upper left corner of the target frame and the ordinate of the center point of the image block, a difference value between the abscissa of the upper right corner of the target frame and the abscissa of the center point of the image block, a difference value between the ordinate of the upper right corner of the target frame and the ordinate of the center point of the image block, a difference value between the abscissa of the lower left corner of the target frame and the abscissa of the center point of the image block, a difference value between the ordinate of the lower left corner of the target frame and the ordinate of the center point of the image block, a difference value between the abscissa of the lower right corner of the target frame and the abscissa of the center point of the image block, and a difference value between the ordinate of the lower right corner of the target frame and the ordinate of the center point of the image block.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810250785.6A CN108537825B (en) | 2018-03-26 | 2018-03-26 | Target tracking method based on transfer learning regression network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108537825A CN108537825A (en) | 2018-09-14 |
CN108537825B true CN108537825B (en) | 2021-08-17 |
Family
ID=63484603
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810250785.6A Expired - Fee Related CN108537825B (en) | 2018-03-26 | 2018-03-26 | Target tracking method based on transfer learning regression network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108537825B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109493370B (en) * | 2018-10-12 | 2021-07-02 | 西南交通大学 | Target tracking method based on space offset learning |
CN111127510B (en) * | 2018-11-01 | 2023-10-27 | 杭州海康威视数字技术股份有限公司 | Target object position prediction method and device |
CN110162475B (en) * | 2019-05-27 | 2023-04-18 | 浙江工业大学 | Software defect prediction method based on deep migration |
CN113192062A (en) * | 2021-05-25 | 2021-07-30 | 湖北工业大学 | Arterial plaque ultrasonic image self-supervision segmentation method based on image restoration |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103310466B (en) * | 2013-06-28 | 2016-02-17 | 安科智慧城市技术(中国)有限公司 | A kind of monotrack method and implement device thereof |
US9928405B2 (en) * | 2014-01-13 | 2018-03-27 | Carnegie Mellon University | System and method for detecting and tracking facial features in images |
US10303977B2 (en) * | 2016-06-28 | 2019-05-28 | Conduent Business Services, Llc | System and method for expanding and training convolutional neural networks for large size input images |
CN107146237B (en) * | 2017-04-24 | 2020-02-18 | 西南交通大学 | Target tracking method based on online state learning and estimation |
CN107452023A (en) * | 2017-07-21 | 2017-12-08 | 上海交通大学 | A kind of monotrack method and system based on convolutional neural networks on-line study |
- 2018-03-26: CN application CN201810250785.6A, patent CN108537825B, status not active (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
CN108537825A (en) | 2018-09-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110660082B (en) | Target tracking method based on graph convolution and trajectory convolution network learning | |
Postels et al. | Sampling-free epistemic uncertainty estimation using approximated variance propagation | |
CN109800689B (en) | Target tracking method based on space-time feature fusion learning | |
Mukhoti et al. | Evaluating bayesian deep learning methods for semantic segmentation | |
Wang et al. | Predrnn: Recurrent neural networks for predictive learning using spatiotemporal lstms | |
CN108537825B (en) | Target tracking method based on transfer learning regression network | |
CN111626128B (en) | Pedestrian detection method based on improved YOLOv3 in orchard environment | |
CN107146237B (en) | Target tracking method based on online state learning and estimation | |
CN108416266B (en) | Method for rapidly identifying video behaviors by extracting moving object through optical flow | |
CN109993770B (en) | Target tracking method for adaptive space-time learning and state recognition | |
CN112348849A (en) | Twin network video target tracking method and device | |
CN114022759A (en) | Airspace finite pixel target detection system and method fusing neural network space-time characteristics | |
CN107680116A (en) | A kind of method for monitoring moving object in video sequences | |
CN110827320B (en) | Target tracking method and device based on time sequence prediction | |
CN113221787A (en) | Pedestrian multi-target tracking method based on multivariate difference fusion | |
CN109493370B (en) | Target tracking method based on space offset learning | |
CN107798329B (en) | CNN-based adaptive particle filter target tracking method | |
DE102022210129A1 (en) | IMAGE PROCESSING VIA ISOTONIC CONVOLUTIONAL NEURONAL NETWORKS | |
CN109272036B (en) | Random fern target tracking method based on depth residual error network | |
CN110111358B (en) | Target tracking method based on multilayer time sequence filtering | |
CN112967267B (en) | Laser directional energy deposition sputtering counting method of full convolution neural network | |
CN114581485A (en) | Target tracking method based on language modeling pattern twin network | |
Wang et al. | Unsupervised Defect Segmentation in Selective Laser Melting | |
Zhang et al. | An Improved Detection Algorithm For Pre-processing Problem Based On PointPillars | |
Zhou et al. | MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20210817 |