CN108537825B - Target tracking method based on transfer learning regression network

Info

Publication number
CN108537825B
CN108537825B
Authority
CN
China
Prior art keywords
image
target
network
ordinate
abscissa
Legal status
Expired - Fee Related
Application number
CN201810250785.6A
Other languages
Chinese (zh)
Other versions
CN108537825A (en)
Inventor
权伟
李天瑞
江永全
何武
刘跃平
卢学民
王晔
贾成君
陈锦雄
Current Assignee
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date
Filing date
Publication date
Application filed by Southwest Jiaotong University
Priority to CN201810250785.6A
Publication of CN108537825A
Application granted
Publication of CN108537825B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The invention provides a target tracking method based on a transfer learning regression network, and relates to the technical field of computer vision. A target object to be tracked is selected and determined from the initial image; a target position regression network based on block prediction is constructed; a tracking-oriented training data set is generated and the network is trained. For image input under real-time processing, the video image collected by the camera and stored in the storage area is extracted as the input image to be tracked. For target localization, the obtained image is input into the position regression network, and after forward processing by the network the output layer yields 8 × 8 × 8 relative position data. For the network update, the 8 × 8 × 8 relative positions between the 8 × 8 image blocks into which the whole image is divided and the target are calculated from the obtained target position and, together with the current input image, form a group of training data.

Description

Target tracking method based on transfer learning regression network
Technical Field
The present invention relates to the technical fields of computer vision, computer graphics and images, and machine intelligence and systems.
Background
Visual target tracking is an important research subject in the field of computer vision. Its main task is to acquire information such as the continuous position, appearance and motion of a target, and thereby provide a basis for further semantic-level analysis (such as behavior recognition and scene understanding). Target tracking research is widely applied in fields such as intelligent monitoring, human-computer interaction and automatic control systems, and has strong practical value. At present, target tracking methods mainly comprise classical target tracking methods and deep learning target tracking methods.
Classical target tracking methods are mainly divided into generative methods and discriminative methods. Generative methods assume that the target can be expressed through some generation process or model, such as Principal Component Analysis (PCA) or Sparse Coding, and then treat the tracking problem as finding the most likely candidate in the region of interest. These methods aim at designing an image representation that facilitates robust target tracking. Unlike generative methods, discriminative methods treat tracking as a classification or continuous object detection problem, whose task is to distinguish the target from the image background. This type of method, which utilizes both target and background information, is currently the main direction of research. Discriminative methods typically involve two main steps: the first is training, in which a classifier and its decision rules are obtained by selecting visual features that discriminate between target and background; the second is tracking, in which the classifier is used to evaluate each location within the field of view and to determine the most likely target location. The target frame is then moved to that location and the process is repeated to effect tracking, and this framework has been used to design tracking algorithms of various forms. In general, the main advantages of classical tracking methods are their running speed and low dependence on auxiliary data, while they also require a trade-off between the accuracy and the real-time performance of tracking.
Deep Learning, a hot spot of machine learning research in recent years, has achieved surprising success in many areas such as speech recognition, image recognition, object detection and video classification, owing to its powerful feature expression capability and to evolving data sets and hardware support. Research on deep learning target tracking has also developed rapidly, but because prior knowledge is lacking in target tracking and real-time performance is required, deep learning techniques based on large amounts of training data and parameter computation are difficult to exploit fully in this respect, and a large space for exploration remains. Judging from current research results, deep learning tracking methods mainly apply auto-encoder networks and convolutional neural networks, and the research follows two main ideas: one is to perform transfer learning on the network and then carry out online fine tuning, and the other is to modify the structure of the deep network to adapt it to the tracking requirements. The auto-encoder network (AE) is a typical unsupervised deep learning network; owing to its feature learning capability and noise robustness it was the first to be applied to target tracking. Overall, the auto-encoder network is intuitive and moderate in size, is an excellent unsupervised deep learning model, and was applied to tracking first with fairly good results. In contrast to auto-encoder networks, Convolutional Neural Networks (CNNs) are supervised feedforward neural networks that involve repeatedly alternating convolution, nonlinear transformation and downsampling operations, and they exhibit very powerful performance in pattern recognition, especially in computer vision tasks. In general, deep learning has a stronger feature expression capability than classical methods, while further research is still needed on the selection of training sets, the choice and structure of the network, the real-time performance of algorithms, and the application of recurrent neural networks in tracking methods.
Disclosure of Invention
The invention aims to provide a target tracking method based on a transfer learning regression network, in which a deep neural network is used to solve the problems of inaccurate training data and imprecise target positioning during tracking.
The purpose of the invention is realized by the following technical scheme. The method, which makes use of a VGG-19 network, comprises the following steps:
(1) target selection
A target object to be tracked is selected and determined from the initial image; in this target selection process the target is either extracted automatically by a moving target detection method or specified manually by a human-computer interaction method;
(2) target position regression network construction based on block prediction
The target position regression network based on block prediction consists of four parts: an image input layer, a migration network for feature expression, a network layer containing 4096 × 1 nodes, and a position output layer containing 8 × 8 × 8 nodes. In the whole network, the input image, after scale normalization to a size of 224 × 224 pixels, serves as the input data of the VGG-19 network; the 23rd layer of the VGG-19 network is fully connected with the network layer of 4096 × 1 nodes, i.e. the 23rd layer of VGG-19 is used to perform feature expression on the input image, and the network layer of 4096 × 1 nodes is fully connected with the position output layer of 8 × 8 × 8 nodes;
The input image of size 224 × 224 is divided into 8 × 8 = 64 image blocks, each of size 28 × 28 pixels. The position of each image block corresponds to the position of a node in the first two dimensions of the position output layer, and the 8 nodes in the third dimension of the position output layer represent the relative position of the target predicted by the corresponding image block, so that each input image, after passing through the regression network, yields 8 × 8 × 8 relative position values;
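As an illustration only, a minimal sketch of such a block-prediction position regression network is given below in Python/PyTorch; the mapping of the "23rd layer of VGG-19" onto the first 23 modules of torchvision's pretrained vgg19 feature extractor, the layer names and the framework choice are assumptions, not part of the claimed method.

```python
# Illustrative sketch (assumptions noted above): frozen VGG-19 features,
# a 4096-node fully connected layer, and an 8x8x8 position output layer.
import torch
import torch.nn as nn
import torchvision.models as models

class PositionRegressionNet(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = models.vgg19(pretrained=True)
        # Transfer-learning backbone used for feature expression (kept frozen).
        self.backbone = nn.Sequential(*list(vgg.features.children())[:23])
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.fc = nn.Linear(512 * 28 * 28, 4096)   # 4096 x 1 node layer
        self.out = nn.Linear(4096, 8 * 8 * 8)      # 8 x 8 x 8 position output

    def forward(self, x):                 # x: (N, 3, 224, 224), normalized
        f = self.backbone(x)              # (N, 512, 28, 28) under the slicing assumption
        f = torch.relu(self.fc(torch.flatten(f, 1)))
        y = self.out(f)                   # (N, 512) relative-position values
        return y.view(-1, 8, 8, 8)        # one 8-vector of corner offsets per 28x28 block
```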
(3) trace-oriented training data set generation
To be able to train the position regression network, the training data are acquired here in two ways. On the one hand, for the first frame input image, the corresponding image block is extracted according to the target to be tracked; the target image block is then placed at an arbitrary position of the first frame image by manual synthesis to generate a new image, and the area where the original target image block was located is filled with the mean value of the target image block; at the same time the position where the target image block is placed is recorded, the 8 × 8 × 8 relative positions between the 8 × 8 image blocks into which the whole image is divided and the target are calculated, and these position coordinate data serve as the expected output of the network and form a group of training data together with the image. On the other hand, the extracted target image block is first transformed, including operations such as translation, rotation, distortion and occlusion, and is then placed in the image by the same method to synthesize a training image. All the training data form a training data set which is then used for network training;
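A minimal sketch of synthesizing one training pair from the first frame under the scheme above is shown next; the function name, the corner ordering (upper left, upper right, lower left, lower right) and the block-center convention are illustrative assumptions.

```python
# Illustrative sketch: paste the target patch at a random position, fill its
# original area with the patch mean, and build the 8x8x8 relative-position label.
import numpy as np

def make_training_pair(frame, target_box, rng=np.random):
    """frame: 224x224x3 image; target_box: (x0, y0, w, h) of the target."""
    x0, y0, w, h = target_box
    patch = frame[y0:y0 + h, x0:x0 + w].copy()
    img = frame.copy()
    img[y0:y0 + h, x0:x0 + w] = patch.mean(axis=(0, 1), keepdims=True)
    nx, ny = rng.randint(0, 225 - w), rng.randint(0, 225 - h)   # new placement
    img[ny:ny + h, nx:nx + w] = patch
    # Corners of the synthesized target frame: ul, ur, ll, lr (assumed ordering).
    corners = [(nx, ny), (nx + w, ny), (nx, ny + h), (nx + w, ny + h)]
    label = np.zeros((8, 8, 8), dtype=np.float32)
    for i in range(8):
        for j in range(8):
            cx, cy = j * 28 + 14, i * 28 + 14        # center of block (i, j)
            label[i, j] = [v for (px, py) in corners for v in (px - cx, py - cy)]
    return img, label
```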
(4) network training
During network training, the images used for training are input one by one, the parameters of the VGG-19 network part are kept unchanged, and the connection parameters between the 23rd layer of the VGG-19 network, the network layer containing 4096 × 1 nodes and the position output layer containing 8 × 8 × 8 nodes are trained with the classical stochastic gradient descent (SGD) method;
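A training-loop sketch under the assumptions above follows; the mean-squared-error loss, the learning rate and the momentum value are illustrative, since only stochastic gradient descent and a fixed VGG-19 part are specified.

```python
# Illustrative sketch: only the new fully connected layers are updated by SGD.
import torch

def train(net, pairs, epochs=1, lr=1e-3):
    net.train()
    params = list(net.fc.parameters()) + list(net.out.parameters())
    opt = torch.optim.SGD(params, lr=lr, momentum=0.9)
    loss_fn = torch.nn.MSELoss()                 # assumed regression loss
    for _ in range(epochs):
        for img, label in pairs:                 # img: (3,224,224), label: (8,8,8) tensors
            pred = net(img.unsqueeze(0))
            loss = loss_fn(pred, label.unsqueeze(0))
            opt.zero_grad()
            loss.backward()
            opt.step()
```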
(5) image input
Under real-time processing conditions, the video image collected by the camera and stored in the storage area is extracted as the input image to be tracked; under offline processing conditions, the acquired video file is decomposed into an image sequence consisting of a number of frames, and the frame images are extracted one by one in temporal order as input images. If the input image is empty, the whole process stops;
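A small sketch of the two input modes is given below; the use of OpenCV and the resizing step are assumptions made only for illustration.

```python
# Illustrative sketch: yield frames from a camera (source=0) or from a video file path.
import cv2

def frames(source=0):
    cap = cv2.VideoCapture(source)
    while True:
        ok, frame = cap.read()
        if not ok:               # empty input: stop the whole process
            break
        yield cv2.resize(frame, (224, 224))
    cap.release()
```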
(6) target localization
The image obtained in step (5) is input into the position regression network; after forward processing by the network, the network output layer yields 8 × 8 × 8 relative position data, and the target is located by processing these 8 × 8 × 8 output node values. Let $A_{i,j}$ denote the $(i,j)$-th image block of the current input image, $x^{c}_{i,j}$ the abscissa of its center point and $y^{c}_{i,j}$ the ordinate of its center point. The 8 node values of the regression network output layer corresponding to image block $A_{i,j}$ are

$$\left(\Delta x^{ul}_{i,j},\ \Delta y^{ul}_{i,j},\ \Delta x^{ur}_{i,j},\ \Delta y^{ur}_{i,j},\ \Delta x^{ll}_{i,j},\ \Delta y^{ll}_{i,j},\ \Delta x^{lr}_{i,j},\ \Delta y^{lr}_{i,j}\right),$$

namely the differences between the abscissa of the upper left corner of the target frame and the abscissa of the image block center point, between the ordinate of the upper left corner and the ordinate of the center point, between the abscissa of the upper right corner and the abscissa of the center point, between the ordinate of the upper right corner and the ordinate of the center point, between the abscissa of the lower left corner and the abscissa of the center point, between the ordinate of the lower left corner and the ordinate of the center point, between the abscissa of the lower right corner and the abscissa of the center point, and between the ordinate of the lower right corner and the ordinate of the center point.
The target position is expressed as

$$P = \left(x^{ul},\ y^{ul},\ x^{ur},\ y^{ur},\ x^{ll},\ y^{ll},\ x^{lr},\ y^{lr}\right),$$

where the pairs $(x^{ul}, y^{ul})$, $(x^{ur}, y^{ur})$, $(x^{ll}, y^{ll})$ and $(x^{lr}, y^{lr})$ respectively represent the abscissa and ordinate of the upper left, upper right, lower left and lower right corners of the target frame. The target position predicted by image block $A_{i,j}$ is

$$P_{i,j} = \left(x^{ul}_{i,j},\ y^{ul}_{i,j},\ x^{ur}_{i,j},\ y^{ur}_{i,j},\ x^{ll}_{i,j},\ y^{ll}_{i,j},\ x^{lr}_{i,j},\ y^{lr}_{i,j}\right),$$

i.e. for $A_{i,j}$ there is

$$x^{k}_{i,j} = x^{c}_{i,j} + \Delta x^{k}_{i,j}, \qquad y^{k}_{i,j} = y^{c}_{i,j} + \Delta y^{k}_{i,j}, \qquad k \in \{ul,\, ur,\, ll,\, lr\}.$$

Similarly, each image block has its own predicted target position, and the corner coordinates predicted by different blocks usually differ, so the four corner coordinates predicted by all image blocks need to be analyzed statistically, and a coordinate accumulation method is adopted to determine the final coordinate of each corner of the target frame and thereby locate the whole target. Concretely, let $M^{ul}$, $M^{ur}$, $M^{ll}$ and $M^{lr}$ denote the coordinate accumulation matrices of the upper left, upper right, lower left and lower right corners of the target frame respectively, where $M^{ul}(a,b)$, $M^{ur}(a,b)$, $M^{ll}(a,b)$ and $M^{lr}(a,b)$ are the values of the corresponding matrices at $(a,b)$, with $0 \le a, b \le 224$, and every element of these matrices is initially 0. For image block $A_{i,j}$ there is

$$M^{k}\!\left(x^{k}_{i,j},\ y^{k}_{i,j}\right) \leftarrow M^{k}\!\left(x^{k}_{i,j},\ y^{k}_{i,j}\right) + 1, \qquad k \in \{ul,\, ur,\, ll,\, lr\},$$

whereby the four matrices are accumulated over every image block.
Finally, the coordinates of the element with the maximum value in each matrix are taken as the coordinates of the corresponding corner of the target frame, i.e.

$$\left(x^{k},\ y^{k}\right) = \operatorname*{arg\,max}_{(a,\,b)} M^{k}(a,b), \qquad k \in \{ul,\, ur,\, ll,\, lr\},$$

where $(x^{ul}, y^{ul})$, $(x^{ur}, y^{ur})$, $(x^{ll}, y^{ll})$ and $(x^{lr}, y^{lr})$ are the horizontal and vertical coordinates of the element with the maximum value in the coordinate accumulation matrices of the upper left, upper right, lower left and lower right corners respectively, which completes target positioning;
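A sketch of this coordinate-accumulation localization follows; the vote increment of one per image block and the offset ordering are assumptions consistent with the description above.

```python
# Illustrative sketch: every image block votes for the four corner positions it
# predicts, and the coordinates with the most votes become the target frame corners.
import numpy as np

def locate(pred):
    """pred: (8, 8, 8) network output, offsets ordered ul_x, ul_y, ur_x, ur_y,
    ll_x, ll_y, lr_x, lr_y (assumed ordering)."""
    acc = np.zeros((4, 225, 225), dtype=np.int32)       # one matrix per corner
    for i in range(8):
        for j in range(8):
            cx, cy = j * 28 + 14, i * 28 + 14           # center of block (i, j)
            for k in range(4):
                x = int(round(cx + pred[i, j, 2 * k]))
                y = int(round(cy + pred[i, j, 2 * k + 1]))
                if 0 <= x <= 224 and 0 <= y <= 224:
                    acc[k, x, y] += 1
    corners = []
    for k in range(4):                                   # argmax per corner matrix
        x, y = np.unravel_index(np.argmax(acc[k]), acc[k].shape)
        corners.append((int(x), int(y)))
    return corners        # [(ul), (ur), (ll), (lr)]
```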
(7) network update
According to the target position obtained in step (6), the 8 × 8 × 8 relative positions between the 8 × 8 image blocks into which the whole image is divided and the target are calculated and, together with the current input image, form a group of training data; one round of network training is then performed to realize fine tuning and updating of the network, after which the process jumps to step (5).
The relative position values comprise the difference between the abscissa of the upper left corner of the target frame and the abscissa of the image block center point, the difference between the ordinate of the upper left corner and the ordinate of the center point, the difference between the abscissa of the upper right corner and the abscissa of the center point, the difference between the ordinate of the upper right corner and the ordinate of the center point, the difference between the abscissa of the lower left corner and the abscissa of the center point, the difference between the ordinate of the lower left corner and the ordinate of the center point, the difference between the abscissa of the lower right corner and the abscissa of the center point, and the difference between the ordinate of the lower right corner and the ordinate of the center point.
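A sketch of the update step of step (7), reusing the illustrative helpers above, is shown below; the helper names are assumptions.

```python
# Illustrative sketch: rebuild the 8x8x8 label from the located corners and
# run one fine-tuning pass on the current frame.
import numpy as np

def update(net, img, corners, train_step):
    """corners: [(ul), (ur), (ll), (lr)] from locate(); train_step: one SGD pass."""
    label = np.zeros((8, 8, 8), dtype=np.float32)
    for i in range(8):
        for j in range(8):
            cx, cy = j * 28 + 14, i * 28 + 14
            label[i, j] = [v for (px, py) in corners for v in (px - cx, py - cy)]
    train_step(net, img, label)      # one round of training to fine-tune the network
```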
The advantages and positive effects are as follows. The method first constructs a position regression network consisting of four parts: an image input layer, a migration network for feature expression, a network layer containing 4096 × 1 nodes, and a position output layer containing 8 × 8 × 8 nodes. The image input layer performs uniform preprocessing on input images, normalizing them to 224 × 224 pixels; the migration network is the pre-trained network VGG-19, whose 23rd layer serves as the feature expression layer and is fully connected with the network layer of 4096 × 1 nodes, which in turn is fully connected with the position output layer of 8 × 8 × 8 nodes. In order to train the position regression network effectively, the target and the image are subjected to a number of transformations, the corresponding training data set is synthesized, and network training is carried out with the classical stochastic gradient descent method. An input image processed forward by the regression network yields a prediction of the target position for each image block into which the whole image is divided, comprising 8 × 8 × 8 relative positions corresponding to the relative coordinate values of the four corners of the target frame. The target can therefore be located by a coordinate accumulation method, and tracking is thereby realized. In addition, each time target positioning is completed, the network is finely adjusted and updated according to the currently determined target position, so that the network has a certain capability of adapting in step with the target. By exploiting the strong feature expression ability of deep learning, the invention can handle complex tracking scenes and realize accurate target tracking; at the same time, the regression-based method avoids a large amount of position searching, greatly improves the target positioning speed, and can realize real-time target tracking. Moreover, the method can be used for single-target tracking and can also be extended to multi-target tracking by correspondingly improving the network (for example its output end).
Drawings
FIG. 1 is a block diagram of the present invention.
FIG. 2 is a flow chart of the present invention.
Detailed Description
The method can be used in various target tracking scenarios, such as intelligent video analysis, automatic human-computer interaction, traffic video monitoring, unmanned vehicle driving, biological population analysis, field animal motion analysis, moving object detection at crossings, fluid surface velocity measurement, and the like.
Take intelligent video analysis as an example. Intelligent video analysis comprises a number of important automatic analysis tasks such as behavior analysis, abnormality alarm and video compression, and the basis of these tasks is stable target tracking. They can be realized with the tracking method provided by the invention. Specifically, a position regression network based on transfer learning is first established, as shown in FIG. 1; the target and the image are then subjected to a number of transformations, the corresponding training data set is synthesized, and network training is carried out with the classical stochastic gradient descent method; after training, the network has acquired the capability of locating the target. During tracking, the regression network processes the input image forward and outputs the relative position information of the target corresponding to the image; from this information the target position can be statistically analyzed and located by the coordinate accumulation method, thereby realizing tracking. In addition, each time target positioning is completed, the network is finely adjusted and updated according to the currently determined target position, so that the network has a certain capability of adapting in step with the target. By exploiting the strong feature expression ability of deep learning, the invention can handle complex tracking scenes and realize accurate target tracking; at the same time, the regression-based method avoids a large amount of position searching, greatly improves the target positioning speed, and can realize real-time target tracking. Moreover, the method can be used for single-target tracking and can also be extended to multi-target tracking by correspondingly improving the network (for example its output end).
In summary, the method first establishes a position regression network based on transfer learning, then subjects the target and the image to various transformations to synthesize the corresponding training data set, and carries out network training with the classical stochastic gradient descent method; after training, the network has acquired the capability of locating the target. During tracking, the regression network processes the input image forward and outputs the relative position information of the target corresponding to the image; from this information the target position can be statistically analyzed and located by the coordinate accumulation method, thereby realizing tracking. In addition, each time target positioning is completed, the network is finely adjusted and updated according to the currently determined target position, so that the network has a certain capability of adapting in step with the target.
The method can be implemented by programming in any computer programming language (such as C), and tracking system software based on the method can realize real-time target tracking applications on any PC or embedded system.

Claims (2)

1. A target tracking method based on a transfer learning regression network, utilizing a VGG-19 network and characterized by comprising the following steps:
(1) target selection
A target object to be tracked is selected and determined from the initial image; in this target selection process the target is either extracted automatically by a moving target detection method or specified manually by a human-computer interaction method;
(2) target position regression network construction based on block prediction
The target position regression network based on block prediction consists of four parts: an image input layer, a migration network for feature expression, a network layer containing 4096 × 1 nodes, and a position output layer containing 8 × 8 × 8 nodes. In the whole network, the input image, after scale normalization to a size of 224 × 224 pixels, serves as the input data of the VGG-19 network; the 23rd layer of the VGG-19 network is fully connected with the network layer of 4096 × 1 nodes, i.e. the 23rd layer of VGG-19 is used to perform feature expression on the input image, and the network layer of 4096 × 1 nodes is fully connected with the position output layer of 8 × 8 × 8 nodes;
The input image of size 224 × 224 is divided into 8 × 8 = 64 image blocks, each of size 28 × 28 pixels. The position of each image block corresponds to the position of a node in the first two dimensions of the position output layer, and the 8 nodes in the third dimension of the position output layer represent the relative position of the target predicted by the corresponding image block, so that each input image, after passing through the regression network, yields 8 × 8 × 8 relative position values;
(3) trace-oriented training data set generation
To be able to train the position regression network, the training data are acquired here in two ways. On the one hand, for the first frame input image, the corresponding image block is extracted according to the target to be tracked; the target image block is then placed at an arbitrary position of the first frame image by manual synthesis to generate a new image, and the area where the original target image block was located is filled with the mean value of the target image block; at the same time the position where the target image block is placed is recorded, the 8 × 8 × 8 relative positions between the 8 × 8 image blocks into which the whole image is divided and the target are calculated, and these position coordinate data serve as the expected output of the network and form a group of training data together with the image. On the other hand, the extracted target image block is first transformed, including operations such as translation, rotation, distortion and occlusion, and is then placed in the image by the same method to synthesize a training image. All the training data form a training data set which is then used for network training;
(4) network training
During network training, the images used for training are input one by one, the parameters of the VGG-19 network part are kept unchanged, and the connection parameters between the 23rd layer of the VGG-19 network, the network layer containing 4096 × 1 nodes and the position output layer containing 8 × 8 × 8 nodes are trained with the classical stochastic gradient descent (SGD) method;
(5) image input
Under real-time processing conditions, the video image collected by the camera and stored in the storage area is extracted as the input image to be tracked; under offline processing conditions, the acquired video file is decomposed into an image sequence consisting of a number of frames, and the frame images are extracted one by one in temporal order as input images; if the input image is empty, the whole process stops;
(6) target localization
The image obtained in step (5) is input into the position regression network; after forward processing by the network, the network output layer yields 8 × 8 × 8 relative position data, and the target is located by processing these 8 × 8 × 8 output node values; let $A_{i,j}$ denote the $(i,j)$-th image block of the current input image, $x^{c}_{i,j}$ the abscissa of its center point and $y^{c}_{i,j}$ the ordinate of its center point; the 8 node values of the regression network output layer corresponding to image block $A_{i,j}$ are

$$\left(\Delta x^{ul}_{i,j},\ \Delta y^{ul}_{i,j},\ \Delta x^{ur}_{i,j},\ \Delta y^{ur}_{i,j},\ \Delta x^{ll}_{i,j},\ \Delta y^{ll}_{i,j},\ \Delta x^{lr}_{i,j},\ \Delta y^{lr}_{i,j}\right),$$

namely the differences between the abscissa of the upper left corner of the target frame and the abscissa of the image block center point, between the ordinate of the upper left corner and the ordinate of the center point, between the abscissa of the upper right corner and the abscissa of the center point, between the ordinate of the upper right corner and the ordinate of the center point, between the abscissa of the lower left corner and the abscissa of the center point, between the ordinate of the lower left corner and the ordinate of the center point, between the abscissa of the lower right corner and the abscissa of the center point, and between the ordinate of the lower right corner and the ordinate of the center point;
the target position is expressed as

$$P = \left(x^{ul},\ y^{ul},\ x^{ur},\ y^{ur},\ x^{ll},\ y^{ll},\ x^{lr},\ y^{lr}\right),$$

where the pairs $(x^{ul}, y^{ul})$, $(x^{ur}, y^{ur})$, $(x^{ll}, y^{ll})$ and $(x^{lr}, y^{lr})$ respectively represent the abscissa and ordinate of the upper left, upper right, lower left and lower right corners of the target frame; the target position predicted by image block $A_{i,j}$ is

$$P_{i,j} = \left(x^{ul}_{i,j},\ y^{ul}_{i,j},\ x^{ur}_{i,j},\ y^{ur}_{i,j},\ x^{ll}_{i,j},\ y^{ll}_{i,j},\ x^{lr}_{i,j},\ y^{lr}_{i,j}\right),$$

i.e. for $A_{i,j}$ there is

$$x^{k}_{i,j} = x^{c}_{i,j} + \Delta x^{k}_{i,j}, \qquad y^{k}_{i,j} = y^{c}_{i,j} + \Delta y^{k}_{i,j}, \qquad k \in \{ul,\, ur,\, ll,\, lr\};$$

each image block has its own predicted target position, and the corner coordinates predicted by different blocks usually differ, so the four corner coordinates predicted by all image blocks need to be analyzed statistically, and a coordinate accumulation method is adopted to determine the final coordinate of each corner of the target frame and thereby locate the whole target; concretely, let $M^{ul}$, $M^{ur}$, $M^{ll}$ and $M^{lr}$ denote the coordinate accumulation matrices of the upper left, upper right, lower left and lower right corners of the target frame respectively, where $M^{ul}(a,b)$, $M^{ur}(a,b)$, $M^{ll}(a,b)$ and $M^{lr}(a,b)$ are the values of the corresponding matrices at $(a,b)$, with $0 \le a, b \le 224$, and every element of these matrices is initially 0; for image block $A_{i,j}$ there is

$$M^{k}\!\left(x^{k}_{i,j},\ y^{k}_{i,j}\right) \leftarrow M^{k}\!\left(x^{k}_{i,j},\ y^{k}_{i,j}\right) + 1, \qquad k \in \{ul,\, ur,\, ll,\, lr\},$$

whereby the four matrices are accumulated over every image block;
finally, the coordinates of the element with the maximum value in each matrix are taken as the coordinates of the corresponding corner of the target frame, i.e.

$$\left(x^{k},\ y^{k}\right) = \operatorname*{arg\,max}_{(a,\,b)} M^{k}(a,b), \qquad k \in \{ul,\, ur,\, ll,\, lr\},$$

where $(x^{ul}, y^{ul})$, $(x^{ur}, y^{ur})$, $(x^{ll}, y^{ll})$ and $(x^{lr}, y^{lr})$ are the horizontal and vertical coordinates of the element with the maximum value in the coordinate accumulation matrices of the upper left, upper right, lower left and lower right corners respectively, which completes target positioning;
(7) network update
According to the target position obtained in step (6), the 8 × 8 × 8 relative positions between the 8 × 8 image blocks into which the whole image is divided and the target are calculated and, together with the current input image, form a group of training data; one round of network training is then performed to realize fine tuning and updating of the network, after which the process jumps to step (5).
2. The target tracking method based on the transfer learning regression network according to claim 1, characterized in that: the relative position values comprise the difference between the abscissa of the upper left corner of the target frame and the abscissa of the image block center point, the difference between the ordinate of the upper left corner and the ordinate of the center point, the difference between the abscissa of the upper right corner and the abscissa of the center point, the difference between the ordinate of the upper right corner and the ordinate of the center point, the difference between the abscissa of the lower left corner and the abscissa of the center point, the difference between the ordinate of the lower left corner and the ordinate of the center point, the difference between the abscissa of the lower right corner and the abscissa of the center point, and the difference between the ordinate of the lower right corner and the ordinate of the center point.
CN201810250785.6A 2018-03-26 2018-03-26 Target tracking method based on transfer learning regression network Expired - Fee Related CN108537825B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810250785.6A CN108537825B (en) 2018-03-26 2018-03-26 Target tracking method based on transfer learning regression network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810250785.6A CN108537825B (en) 2018-03-26 2018-03-26 Target tracking method based on transfer learning regression network

Publications (2)

Publication Number Publication Date
CN108537825A CN108537825A (en) 2018-09-14
CN108537825B true CN108537825B (en) 2021-08-17

Family

ID=63484603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810250785.6A Expired - Fee Related CN108537825B (en) 2018-03-26 2018-03-26 Target tracking method based on transfer learning regression network

Country Status (1)

Country Link
CN (1) CN108537825B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109493370B (en) * 2018-10-12 2021-07-02 西南交通大学 Target tracking method based on space offset learning
CN111127510B (en) * 2018-11-01 2023-10-27 杭州海康威视数字技术股份有限公司 Target object position prediction method and device
CN110162475B (en) * 2019-05-27 2023-04-18 浙江工业大学 Software defect prediction method based on deep migration
CN113192062A (en) * 2021-05-25 2021-07-30 湖北工业大学 Arterial plaque ultrasonic image self-supervision segmentation method based on image restoration

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103310466B (en) * 2013-06-28 2016-02-17 安科智慧城市技术(中国)有限公司 A kind of monotrack method and implement device thereof
US9928405B2 (en) * 2014-01-13 2018-03-27 Carnegie Mellon University System and method for detecting and tracking facial features in images
US10303977B2 (en) * 2016-06-28 2019-05-28 Conduent Business Services, Llc System and method for expanding and training convolutional neural networks for large size input images
CN107146237B (en) * 2017-04-24 2020-02-18 西南交通大学 Target tracking method based on online state learning and estimation
CN107452023A (en) * 2017-07-21 2017-12-08 上海交通大学 A kind of monotrack method and system based on convolutional neural networks on-line study

Also Published As

Publication number Publication date
CN108537825A (en) 2018-09-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210817